Author: Axel Schumacher
Date: 08:09:43 08/08/03
On August 08, 2003 at 03:16:47, Günther Simon wrote:

>On August 08, 2003 at 02:04:48, Axel Schumacher wrote:
>
>>Example:
>>EloStat, some of the lower ranked engines:
>>728 Yawce 0.16         : 1962  58  33  263  31.9 %  2094   6.8 %
>>733 Raffaela           : 1951  78  46  130  30.0 %  2098  12.3 %
>>736 Nero 5.3           : 1934 114  60   81  30.2 %  2079   1.2 %
>>755 Pierre 1.7         : 1861  60  30  290  30.2 %  2007   3.8 %
>>773 ROBOKewlper 0.047  : 1778 143  55   71  15.5 %  2073  14.1 %
>>775 Bigbook 3.1        : 1765  48  24  443  28.0 %  1929   9.5 %
>>781 König Schwarz      : 1717  53  42  182  36.0 %  1817  20.3 %
>>787 Kace 0.8           : 1643 123  75   47  22.3 %  1860  23.4 %
>>
>>and the same with Fritz (even much higher values):
>>
>> Yawce 0.16          2080  262
>> Nero 5.3            2073   79
>> Raffaela            2064  130
>> Pierre 1.7          1983  288
>> Bigbook 3.1         1902  441
>> ROBOKewlper 0.047   1898   69
>> König Schwarz       1880  182
>> Kace 0.8            1805   47
>>
>>
>>Axel
>
>I don't know how your tournaments are structured, but you should
>take care to have pools of players that are in a not too
>distant range of Elo.
>You should consider making leagues or running some Swiss tourneys.
>
>What would happen if you didn't count any games between
>players who differ by more than 400 Elo?
>
>From the above ratings I can give you an example of why it
>does not work as it should, even with EloStat.
>I can see that Raffaela has a 30% score and a rating of 1951
>(in reality it is hardly over 1500).
>Imagine Raffaela had played 70x versus Fritz 8 and 30x
>against Kace (assume it wins all games versus Kace (which I doubt)
>and loses all games versus Fritz 8): it would get a highly
>inflated rating, which would also influence all the other
>(in reality) weak opponents of Raffaela etc...

You're right. However, Raffaela certainly didn't play against Fritz more than once. Usually, after a gauntlet against some engines, each new engine in my tourney plays most of its games against other engines whose Elo is in the same range (+/- 100; mostly small sub-Swiss tourneys).
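To put rough numbers on Günther's hypothetical, here is a minimal sketch. It is not EloStat's actual procedure. It only uses the common simplification "performance = average opponent rating + the Elo difference implied by the score percentage" under the logistic Elo curve. The Fritz 8 rating of 2700 and the 1650 average for a narrow pool are assumptions made up for illustration; the Kace rating is the one from the list above. The function names are hypothetical.

```python
import math


def elo_diff_from_score(p):
    """Elo difference implied by a score fraction p (0 < p < 1), logistic curve."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def performance_rating(results):
    """results: list of (opponent_rating, games, points) tuples.

    Simplified one-step estimate: game-weighted average opponent rating
    plus the difference implied by the overall score. Not EloStat's method.
    """
    games = sum(g for _, g, _ in results)
    points = sum(pts for _, _, pts in results)
    avg_opp = sum(r * g for r, g, _ in results) / games
    return avg_opp + elo_diff_from_score(points / games)


FRITZ8 = 2700   # ASSUMPTION: illustrative rating, not taken from the post
KACE = 1643     # from the EloStat list quoted above

# Günther's scenario: 0/70 against Fritz 8, 30/30 against Kace -> 30 % overall.
mixed = performance_rating([(FRITZ8, 70, 0.0), (KACE, 30, 30.0)])

# Same 30 % score, but against a narrow pool averaging 1650 (ASSUMPTION).
narrow = performance_rating([(1650, 100, 30.0)])

print(f"mixed pool:  {mixed:.0f}")
print(f"narrow pool: {narrow:.0f}")
```

Under these assumptions the same 30 % score comes out several hundred points higher in the mixed pool than in the narrow one, simply because the 70 lost games pull the average opponent rating up. How strongly a full EloStat run would show this depends on the real pairings.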
I agree with Uri that we may need a new way to calculate these data. Maybe we also should not try to compare the absolute Elo values with the Elo ratings we know from human games, unless a substantial part of the computer games were played against humans. It seems I have to play more games against these engines :-)

Regards
Axel

>
>Regards,
>Günther
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.