Author: Uri Blass
Date: 00:52:19 08/08/03
On August 08, 2003 at 03:16:47, Günther Simon wrote:

>On August 08, 2003 at 02:04:48, Axel Schumacher wrote:
>
>>Example:
>>EloStat, some of the lower ranked engines:
>>728 Yawce 0.16        : 1962  58  33  263  31.9 %  2094   6.8 %
>>733 Raffaela          : 1951  78  46  130  30.0 %  2098  12.3 %
>>736 Nero 5.3          : 1934 114  60   81  30.2 %  2079   1.2 %
>>755 Pierre 1.7        : 1861  60  30  290  30.2 %  2007   3.8 %
>>773 ROBOKewlper 0.047 : 1778 143  55   71  15.5 %  2073  14.1 %
>>775 Bigbook 3.1       : 1765  48  24  443  28.0 %  1929   9.5 %
>>781 König Schwarz     : 1717  53  42  182  36.0 %  1817  20.3 %
>>787 Kace 0.8          : 1643 123  75   47  22.3 %  1860  23.4 %
>>
>>and the same with Fritz (even much higher values):
>>
>> Yawce 0.16          2080  262
>> Nero 5.3            2073   79
>> Raffaela            2064  130
>> Pierre 1.7          1983  288
>> Bigbook 3.1         1902  441
>> ROBOKewlper 0.047   1898   69
>> König Schwarz       1880  182
>> Kace 0.8            1805   47
>>
>>Axel
>
>I don't know how your tournaments are structured, but you should
>take care to have pools of players which are not in too
>distant a range of Elo.
>You should consider making leagues or running some Swiss tourneys.
>
>What would happen if you didn't count any games between
>players which differ by more than 400 Elo?
>
>From the above ratings I can give you an example of why it
>does not work as it should, even with EloStat.
>I can see that Raffaela has a 30% score and a rating of 1951
>(in reality it is hardly over 1500).
>Imagine Raffaela had played 70x versus Fritz 8 and 30x
>against Kace (assume it wins all games versus Kace, which I doubt,
>and loses all games versus Fritz 8): it would get a highly
>inflated rating, which would in turn also influence all the other
>(in reality) weak opponents of Raffaela, etc...
>
>Regards,
>Günther

I think that EloStat and Fritz should not be used for rating; it is better to have no rating at all than to use these programs.

The rating of a program should not be based on its result and the average rating of its opponents. There should be an expected result for every rating difference, and the way to calculate ratings is to start with equal ratings for everybody and, in every step, reduce the rating of programs that score less than expected and increase the rating of programs that score more than expected.

The expected result should be calculated for every game; in the first iteration, when all ratings are still equal, it is a draw.

I think that a simple algorithm that, in iteration n, reduces the rating of every program that scores less than expected by 1/(n^0.5) Elo and increases the rating of every program that scores more than expected by 1/(n^0.5) Elo is good enough if we do 100,000,000 iterations. There may be faster ways, but that is not very important when computers are fast and can do millions of iterations in a short time. (I first thought about steps of 1/n Elo, but 1/n is not good enough: because the sum 1 + 1/2 + ... + 1/N grows only like ln(N), a program would need nearly e^100 iterations to move 100 Elo, so I decided on the square root of n instead.)

Uri
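As a rough illustration of the iterative scheme Uri describes, here is a minimal sketch in Python. It assumes the standard Elo expectancy 1/(1 + 10^((Rb-Ra)/400)) for the per-game expected result; the function name iterative_ratings, the games list format, and the starting rating of 2000 are choices made for this example only, not part of the post, and the default iteration count is kept far below 100,000,000 so that plain Python finishes quickly.

    import math
    from collections import defaultdict

    def iterative_ratings(games, iterations=10_000, start=2000.0):
        """Adjust each program's rating by +/- 1/sqrt(n) Elo in iteration n.

        games: list of (player_a, player_b, score_a) tuples, where score_a
               is 1.0 for a win by player_a, 0.5 for a draw, 0.0 for a loss.
        """
        ratings = defaultdict(lambda: start)      # everybody starts equal

        def expected(ra, rb):
            # standard Elo expectancy; in the first iteration all ratings
            # are equal, so the expected result of every game is a draw (0.5)
            return 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))

        for n in range(1, iterations + 1):
            step = 1.0 / math.sqrt(n)             # 1/(n^0.5) Elo in iteration n
            actual = defaultdict(float)
            expect = defaultdict(float)
            for a, b, score_a in games:
                actual[a] += score_a
                actual[b] += 1.0 - score_a
                e = expected(ratings[a], ratings[b])
                expect[a] += e
                expect[b] += 1.0 - e
            for p in actual:
                if actual[p] > expect[p]:
                    ratings[p] += step            # scored more than expected
                elif actual[p] < expect[p]:
                    ratings[p] -= step            # scored less than expected
        return dict(ratings)

    # tiny usage example with made-up results:
    # games = [("A", "B", 1.0), ("B", "C", 0.5), ("A", "C", 0.0)]
    # print(iterative_ratings(games))

With this step schedule the maximum possible movement after N iterations grows like 2*sqrt(N), so a separation of 100 Elo needs only a few thousand iterations, in contrast to the roughly e^100 iterations a 1/n schedule would require, which matches the reasoning in the post.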