Author: Uri Blass
Date: 07:50:36 08/24/01
On August 24, 2001 at 10:34:01, José de Jesús García Ruvalcaba wrote:

>On August 24, 2001 at 10:16:48, Uri Blass wrote:
>
>>On August 24, 2001 at 10:06:32, Miguel A. Ballicora wrote:
>>
>>>On August 24, 2001 at 07:51:16, Uri Blass wrote:
>>>
>>>>On August 24, 2001 at 07:29:21, Günther Simon wrote:
>>>>
>>>>>On August 24, 2001 at 07:15:30, Uri Blass wrote:
>>>>>
>>>>>>On August 24, 2001 at 07:06:51, Uri Blass wrote:
>>>>>>
>>>>>>>Here are the results by
>>>>>>>elostat program
>>>>>>>
>>>>>>>You can see that shredder is only 3rd place micro based on the performance.
>>>>>>>Shredder is the world Micro champion by definition but Tiger and Rebel had a
>>>>>>>better performance.
>>>>>>>
>>>>>>>
>>>>>>>   Program                          Elo    +    -  Games   Score  Av.Op.   Draws
>>>>>>> 1 Deep Junior 7                 : 2745  228  281      9  88.9 %    2384  22.2 %
>>>>>>> 2 Quest (DeepFritz)             : 2550  266  169      9  66.7 %    2430  44.4 %
>>>>>>> 3 Chess Tiger 14.6 Gambit Tiger : 2499  291  229      9  55.6 %    2461  22.2 %
>>>>>>> 4 Crafty 18.10X                 : 2467  291  165      9  55.6 %    2428  44.4 %
>>>>>>> 5 Rebel                         : 2466  291  229      9  55.6 %    2428  22.2 %
>>>>>>> 6 Shredder                      : 2466  266  249      9  66.7 %    2346  22.2 %
>>>>>>> 7 Goliath                       : 2421  291  165      9  55.6 %    2382  44.4 %
>>>>>>> 8 Gromit 3.9.5                  : 2364  278  201      9  61.1 %    2285  33.3 %
>>>>>>> 9 Ferret                        : 2359  291  229      9  55.6 %    2320  22.2 %
>>>>>>>10 Gandalf 5.0                   : 2310  291  229      9  55.6 %    2271  22.2 %
>>>>>>>11 ParSOS                        : 2256  291  229      9  55.6 %    2217  22.2 %
>>>>>>>12 Diep                          : 2227  165  291      9  44.4 %    2265  44.4 %
>>>>>>>13 IsiChess X                    : 2166  201  278      9  38.9 %    2245  33.3 %
>>>>>>>14 Tao                           : 2165  229  291      9  44.4 %    2203  22.2 %
>>>>>>>15 Ruy Lopez                     : 2118  366  266      9  33.3 %    2238   0.0 %
>>>>>>>16 Pharaon                       : 2082  169  266      9  33.3 %    2202  44.4 %
>>>>>>>17 SpiderGirl                    : 2014  213  255      9  27.8 %    2180  33.3 %
>>>>>>>18 XiNiX                         : 1724  400  108      9   5.6 %    2216  11.1 %
>>>>>>>
>>>>>>>Congratulations also to the Deep Junior team for winning the event convincingly,
>>>>>>>when the difference from the second place is almost 200 elo and the hardware
>>>>>>>explains less than 70 elo of the difference.
>>>>>>>
>>>>>>>Uri
>>>>>>
>>>>>>I can add that I think that it may be a better idea to use elostat to decide
>>>>>>about the world champion in the future.
>>>>>>
>>>>>>I know that a lot of people are going to disagree but it is my opinion.
>>>>>>I prefer a complicated method that does more justice and not a simple method.
>>>>>>
>>>>>>Uri
>>>>>
>>>>>
>>>>>Sorry Uri - but this is really nonsense.
>>>>>You can't use ELO-Stat on a Swiss Tournament with 9 rounds as
>>>>>it is described by the author. ELO-Stat is designed to calculate
>>>>>ratings out of a pool of unknown rated progs with a very very lot
>>>>>of games.
>>>>>Therefore if you take a closer look at your table you would see that
>>>>>the error margin is at least 435! pts (Pharaon) and max 632!! (RuyLopez).
>>>>>And would you really believe Parallel SOS to be at 2256? :))
>>>>
>>>>The question is not which program is better.
>>>>Competitions of 9 rounds are not supposed to answer this question.
>>>>
>>>>The question is which program had the better result.
>>>>Elostat answers this question better than the ranking.
>>>
>>>You forget the tournament strategy. Many times, you can adjust the contempt
>>>because you know that a draw is extremely convenient or will give you the
>>>title right away. Not to mention the selection of more or less aggressive opening
>>>books for a special round. Sometimes, a draw is the same as a loss and you risk.
>>>That throws away any significance of a performance ELO in a 9 round tournament.
>>>This also applies for any human tournament.
>>>
>>>You can also have the weird situation where you got 8.5/9 and the one with 8/9
>>>has a better elo performance. They drew each other but a couple of opponents
>>>that play the 1st started to crash many games afterwards because of late minute
>>>changes in the code etc. That was totally out of control of the winner.
>>
>>
>>I think that it is not logical.
>>If you get 8.5/9 your results are not worse than a player who got 8/9 and drew
>>against you.
>>
>>We look for a stable rating.
>>Suppose that you got 8.5/9.
>>Suppose that the rating of the player you drew is better than your rating.
>>I can prove that your rating is not stable and is going to get bigger after the
>>tournament.
>>
>>You do not lose rating from winning 8 games, and the rating of the opponents is
>>not important.
>>You win rating from drawing one game against a player with better elo rating, so
>>the total result is that you earn rating.
>>
>>
>>If elostat allows a situation where 8/9 is better than 8.5/9, including a draw
>>between the 2 best players, then something is wrong with the elostat program.
>>
>>Uri
>
>Hi Uri,
>please try the following experiment with elostat.
>1. Players A, B, and C play each other, with the following individual results:
>A beats B 99.5 to 0.5
>B beats C 99.5 to 0.5
>A beats C 100 to 0
>Which ratings do you get for A, B and C using Elostat?
>
>2. The same players, but with the following results:
>A beats B 99.5 to 0.5
>B beats C 99.5 to 0.5
>Same question as for part 1.
>
>If the program behaves correctly, the rating of A for part 1 should not be lower
>than the rating of A for part 2.
>José.

Unfortunately the program needs PGN, and it calculates the results unless it is a competition by 2 players.

Here is some information from the readme file of this program:

Following this theory, the Elo rating corresponding to a relative performance of 100 % or 0 % is indefinite. Due to mathematical reasons (e.g. to guarantee the feasibility of the iteration procedure) ELOStat assigns to those programs a finite Elo value which is exactly 600 points smaller (0 % perf.) or greater (100 % perf.) than the Av.Op. Elo. Or in other words: ELOStat does not support Elo differences greater than ±600 points (therefore the 95% error margins can be at most ±1200 points). For nearly all practical purposes, this restriction does not play an important role.
In very rare cases ELOStat produces an error message stating that the iteration procedure failed and that no convergence of the Elo mean value could have been reached within the maximum number of iterations specified by the program. This problem only appears when many programs in the database are characterized by 0 % or 100 % results. In these cases the iteration procedure is slowed down significantly so that the Elo calculation takes a much longer time as usual.

Uri
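
For readers who want to check the figures, here is a minimal sketch of a single-pass performance rating with the 600-point limit described in the readme. It assumes the standard logistic Elo curve (difference = -400 * log10(1/p - 1)); the readme excerpt does not state which curve ELOStat uses, and this is not ELOStat's iteration procedure. The names `elo_difference` and `performance` are hypothetical, chosen only for this illustration.

```python
import math

# Assumption taken from the readme excerpt: 0 % and 100 % scores are pinned
# exactly 600 points below/above the average opponent rating (Av.Op.).
CAP = 600.0


def elo_difference(score_fraction, cap=CAP):
    """Rating difference implied by a score fraction.

    Assumes the logistic Elo curve; the result is limited to +/- cap.
    """
    if score_fraction <= 0.0:
        return -cap
    if score_fraction >= 1.0:
        return cap
    diff = -400.0 * math.log10(1.0 / score_fraction - 1.0)
    return max(-cap, min(cap, diff))


def performance(avg_opponent, points, games, cap=CAP):
    """Single-pass performance rating: Av.Op. plus the implied difference."""
    return avg_opponent + elo_difference(points / games, cap)


if __name__ == "__main__":
    # Score and Av.Op. values taken from the table in the quoted post.
    for name, avg_op, points in [
        ("Deep Junior 7", 2384, 8.0),
        ("Quest (DeepFritz)", 2430, 6.0),
        ("XiNiX", 2216, 0.5),
    ]:
        print(name, round(performance(avg_op, points, 9)))
```

Under this assumption, 8.5/9 is worth about 492 points above the average opposition and 8/9 about 361, so in a single-pass calculation the 8/9 player shows the higher performance only if their average opposition was more than roughly 131 points stronger. A 99.5 to 0.5 result, as in José's experiment, would imply a difference above 900 points and is therefore limited to 600 in this sketch.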
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.