Author: blass uri
Date: 15:37:18 07/28/00
On July 28, 2000 at 18:14:13, Dann Corbit wrote:

>On July 28, 2000 at 15:58:53, blass uri wrote:
>
>>On July 28, 2000 at 15:33:47, Dann Corbit wrote:
>><snipped>
>>>    Program      Elo    +    -   Games   Score   Av.Op.   Draws
>>>
>>>  1 LarsenVB  : 2610  186  226    12    79.2 %   2378    25.0 %
>>>  2 Storm     : 2557  223  166    12    58.3 %   2498    33.3 %
>>>  3 Noonian   : 2546  232  144    12    54.2 %   2517    41.7 %
>>>  4 Ozwald    : 2542  166  223    12    41.7 %   2601    33.3 %
>>>  5 Monik     : 2363  214  247    12    62.5 %   2274     8.3 %
>>>  6 Zephyr    : 2317  215  215    12    50.0 %   2317    16.7 %
>>>  7 TSCP      : 2293  180  402    12    83.3 %   2013     0.0 %
>>>  8 SnailSCP  : 2185  214  194    12    62.5 %   2096    25.0 %
>>>  9 Raffaela  : 1893  297  170    12     8.3 %   2310    16.7 %
>>> 10 Golem01   : 1695    0    0    12     0.0 %   2295     0.0 %
>>
>>The elo is simply wrong.
>
>The ELO is approximate, and correct within the stated error bars.
>
>>The right way to calculate elo based on tournament is simply to assume that the
>>tournament happen again and again and calculate the limit of the elo rating for
>>every program when the number of games get closer to infinite
>
>Naturally, this is the process that was used.  About 100 iterations, if I recall
>correctly, is what was used to calculate the table.

I guess that 100 iterations is not enough, and I guess that after 100000 iterations TSCP is going to have a better rating than Monik (it is clear that Monik's rating cannot be better than TSCP's after enough iterations).
Calculating 100000 iterations is not a problem for the computer.

>>(you should not
>>include programs that has 0% or 100%).
>
>What is your scientific reason for exclusion of real data?  It is just as valid
>to win or lose all of your games as it is to win only a fraction of them.

The problem is that if one program has a rating of -infinity (a 0% score leads to a rating of -infinity when the number of iterations is infinite), then the other programs all have a rating of +infinity relative to it, and we cannot establish an order among the other programs.
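To make the point about 0% scores concrete, here is a minimal sketch. It assumes the usual logistic Elo curve with a 400-point scale; the program that produced the table above may use a different curve, and the name `expected_score` is only illustrative. Under this assumption the expected score is above 0 for every finite rating difference, so a program that really scores 0% can only be matched by letting its rating fall without limit.

```python
def expected_score(diff):
    """Expected score of a player rated `diff` Elo points above the opponent.

    Assumption: standard logistic Elo curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


# The expectation shrinks as the deficit grows, but it never reaches 0
# for a finite difference (differences are kept moderate here so that
# the floating-point power does not overflow).
for diff in (0, -400, -1000, -4000):
    print(diff, expected_score(diff))
```

Mathematically the same holds for a difference of 1000000 points; only floating-point arithmetic stops the sketch from showing it directly.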
>
>>In this case you should not count Golem01 because the elo of this program will
>>always go down(the expected result of golem is more than 0% even if there is a
>>difference of 1000000 elo).
>
>Golem will eventually win or draw.  I expect Golem01 verses Raffaela to be about
>even.

When that happens you can include Golem in the rating, provided that it is impossible to divide the programs into 2 classes where one class scored 100% against the other class.

>>TSCP also deserves at least the same rating as Monik because TSCP got 50%
>>against Monik and 100% against other players so the rating of TSCP should always
>>improve when it is behind Monik when the results repeat again.
>
>TSCP has played weaker opponents than Monik, according to the calculations.
>Eventually, the error bars will reduce and I expect that in the end, each ELO
>positional value will agree with the ordinary contest ranking positional value
>(points scored, and tie-breaks).

It is clear that in the end the elo will be in the same order as the ranking, but when you have only part of the data the calculation of the rating is wrong.

Uri
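The idea of repeating the same tournament again and again until the ratings stop moving can be sketched as follows. This is only an illustration under stated assumptions, not the method used to produce the table: it assumes the logistic Elo curve with a 400-point scale, and the players `A`, `B`, `C`, their results, the step size `k` and the starting rating are all hypothetical. No player in the example has 0% or 100%, so the limit exists.

```python
def expected_score(diff):
    # Assumption: standard logistic Elo curve with a 400-point scale.
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def fit_ratings(results, iterations=100000, k=8.0, start=2300.0):
    """Replay the same results `iterations` times and return the ratings.

    `results` maps (a, b) -> (points scored by a, games played).
    Each pass moves every rating by k * (actual points - expected points).
    """
    players = sorted({p for pair in results for p in pair})
    ratings = {p: start for p in players}
    for _ in range(iterations):
        surplus = {p: 0.0 for p in players}
        for (a, b), (points_a, games) in results.items():
            expected_a = games * expected_score(ratings[a] - ratings[b])
            surplus[a] += points_a - expected_a
            surplus[b] -= points_a - expected_a  # b's surplus is the mirror image
        for p in players:
            ratings[p] += k * surplus[p]
    return ratings


# Hypothetical results: A and B split their games, both beat C most of the time.
example = {("A", "B"): (2.0, 4), ("A", "C"): (3.5, 4), ("B", "C"): (3.0, 4)}
ratings = fit_ratings(example, iterations=5000)
for name in sorted(ratings, key=ratings.get, reverse=True):
    print(name, round(ratings[name]))
```

When the iteration has converged, every player's expected points against these ratings equal the points actually scored, which is the sense in which the ratings are the limit of replaying the tournament. With a 0% or 100% score in the data, the same loop would keep pushing that player's rating down or up forever.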
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.