Author: Ralf Elvsén
Date: 17:32:08 07/15/00
On July 15, 2000 at 19:24:29, ShaktiFire wrote:

>On July 15, 2000 at 18:32:52, Mogens Larsen wrote:
>
>>On July 15, 2000 at 18:22:59, Ralf Elvsén wrote:
>>
>>>These are pretty harsh words, especially since I think Uri has a point.
>>>Even if it is not correct I wouldn't call it "nonsense" or "truth distortion".
>>>These judgements should be saved for more clear cases, and there has
>>>certainly been some on this board in the past...
>>
>>No, he doesn't have a point, since you can't determine GM strength by gathering
>>the results of several programs, reach GM strength within the bounds of
>>uncertainty and then conclude that one of the programs are GM strength. Because
>>you already know that none of programs alone are of GM strength with certainty
>>due to a large ELO uncertainty, otherwise it wouldn't be necessary to add them
>>together. So nonsense is the appropriate word, even though truth distortion was
>>unnecessary harsh.
>>
>>Best wishes...
>>Mogens
>
>That is an interesting point. I wonder, do you know the elo formulation
>enough to say the uncertainty. For example, Deep Jr. will achieve a TPR
>based on playing a 9 game tournament. Now if we consider the TPR an
>estimate of the real elo rating, what is the uncertainty using only 9 games.
>How many games required to achieve say a 90% chance of having an elo
>rating within + - 25 pts. of the TPR?

Here is a method I'm using to get estimates I think are as good as they can get without doing exact calculations.

I am counting the fraction of won points. To translate these to elo differences you just need a table. You can find one at

http://www.uschess.org/ratings/info/system.html

If you now play a match between A and B and A wins a fraction p of the points in N games and the fraction of draws is r, then the elo difference can be found by looking up p in the table. As for the uncertainty, a good estimate of the uncertainty in p is

s = squareroot( (p*(1 - p) - r/4)/N )
(the standard deviation). N shouldn't be too small, but I don't know at what point this breaks down because of too few games.

Under certain reasonable assumptions the probability that the "true" value of A's score against B is between p - s and p + s is 68%. The probability that it is between p - 2s and p + 2s is 95%. (Here I assume a normal distribution. Is this too sloppy? I should look into this more carefully.)

Example: 20 games, A wins 8, draws 9 and loses 3.

p = (8 + 9*0.5)/20 = 0.625

Lookup in table: 90 rating points (or slightly below) in A's favour.

r = 9/20 = 0.45
s = squareroot( (0.625*(1 - 0.625) - 0.45/4)/20 ) = 0.078

So with 68% probability the "true" score for A is between 0.625 - 0.078 = 0.547 and 0.625 + 0.078 = 0.703. What 0.547 and 0.703 mean in terms of rating points can once again be found in the table. Etc...

*yawn* The probability that the above is correct is pretty small, since I should have been soundly asleep long ago. I am writing this not only out of altruism; I also hope any flaws in my system will be pointed out.

Good night
Ralf
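
For anyone who wants to check the arithmetic, the recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names score_stats and elo_diff are made up for this sketch, and elo_diff uses the common logistic approximation 400*log10(p/(1 - p)) as a stand-in for the table lookup, so its numbers may differ a little from the table at the link above.

```python
import math

def score_stats(wins, draws, losses):
    """Return (p, r, s) for a match result, using the formula from the post."""
    n = wins + draws + losses
    p = (wins + 0.5 * draws) / n               # fraction of points won
    r = draws / n                              # fraction of draws
    s = math.sqrt((p * (1 - p) - r / 4) / n)   # estimated standard deviation of p
    return p, r, s

def elo_diff(p):
    """ASSUMPTION: logistic approximation used here instead of the table lookup.

    Not defined for p = 0 or p = 1 (the rating difference is unbounded there).
    """
    return 400 * math.log10(p / (1 - p))

# The example from the post: 20 games, 8 wins, 9 draws, 3 losses.
p, r, s = score_stats(8, 9, 3)
low, high = p - s, p + s                       # roughly a 68% interval, assuming normality

print(f"p = {p:.3f}, r = {r:.2f}, s = {s:.3f}")
print(f"68% interval for the score: {low:.3f} to {high:.3f}")
print(f"approx. elo difference: {elo_diff(p):.0f} "
      f"(from {elo_diff(low):.0f} to {elo_diff(high):.0f})")
```

The interval is computed on the score p first and only then translated to rating points, as in the post, so the resulting elo interval is not symmetric around the central estimate.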
Last modified: Thu, 15 Apr 21 08:11:13 -0700