Computer Chess Club Archives


Subject: Re: Upon scientific truth - the nature of information

Author: Ralf Elvsén

Date: 17:32:08 07/15/00


On July 15, 2000 at 19:24:29, ShaktiFire wrote:

>On July 15, 2000 at 18:32:52, Mogens Larsen wrote:
>
>>On July 15, 2000 at 18:22:59, Ralf Elvsén wrote:
>>
>>>These are pretty harsh words, especially since I think Uri has a point.
>>>Even if it is not correct I wouldn't call it "nonsense" or "truth distortion".
>>>These judgements should be saved for clearer cases, and there have
>>>certainly been some on this board in the past...
>>
>>No, he doesn't have a point, since you can't determine GM strength by pooling
>>the results of several programs, reaching GM strength within the bounds of
>>uncertainty, and then concluding that one of the programs is GM strength. You
>>already know that none of the programs alone is of GM strength with certainty,
>>because of the large Elo uncertainty; otherwise it wouldn't be necessary to add
>>them together. So nonsense is the appropriate word, even though truth distortion
>>was unnecessarily harsh.
>>
>>Best wishes...
>>Mogens
>
>That is an interesting point.  I wonder, do you know the Elo formulas well
>enough to state the uncertainty?  For example, Deep Jr. will achieve a TPR
>based on playing a 9-game tournament.  Now if we consider the TPR an
>estimate of the real Elo rating, what is the uncertainty using only 9 games?
>How many games are required to achieve, say, a 90% chance of having an Elo
>rating within +/- 25 pts. of the TPR?

Here is a method I'm using to get estimates I think are as good
as they can get without doing exact calculations.

I count the fraction of points won. To translate this into an
Elo difference you just need a table. You can find one at

http://www.uschess.org/ratings/info/system.html

Suppose A and B play a match of N games in which A scores a fraction p
of the points and a fraction r of the games are drawn. Then the
Elo difference can be found by looking up p in the table.
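
As a rough alternative to the table lookup, here is a minimal Python sketch.
It assumes the logistic Elo curve D = 400*log10(p/(1 - p)), which is not
taken from the linked table and may differ from it by a few points. The
function name elo_diff is made up for illustration.

```python
import math

def elo_diff(p: float) -> float:
    """Elo difference implied by a score fraction p.

    Assumption: logistic Elo curve, used here as a stand-in for the table.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))

# Score fraction from the 20-game example (8 wins, 9 draws, 3 losses).
print(round(elo_diff(0.625), 1))
```

For p = 0.625 this comes out a little under 90 points, in line with the table.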

As for the uncertainty, a good estimate of the standard deviation of p is

s = sqrt( (p*(1 - p) - r/4)/N )

N shouldn't be too small but I don't know where this will
break down due to too few games.
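
The formula for s can be sketched in Python as below. The comments show
where the r/4 term comes from, assuming each game is an independent result
worth 1, 0.5 or 0 points. The function name score_std is hypothetical.

```python
import math

def score_std(p: float, r: float, n: int) -> float:
    """Standard deviation of the score fraction p over n games.

    p: fraction of points won, r: fraction of games drawn.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Per-game variance. With win fraction w = p - r/2:
    #   E[X^2] = w*1 + r*0.25 = p - r/4
    #   Var[X] = E[X^2] - p^2 = p*(1 - p) - r/4
    var = p * (1.0 - p) - r / 4.0
    if var < -1e-12:
        raise ValueError("p and r are inconsistent (negative variance)")
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(var, 0.0) / n)

print(round(score_std(0.625, 0.45, 20), 3))
```

Note how draws reduce the spread: with the same p, a higher draw fraction
gives a smaller s.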

Under certain reasonable assumptions the probability that the "true"
value of A's score against B lies between p - s and p + s is about 68%.
The probability that it lies between p - 2s and p + 2s is about 95%.
(Here I assume a normal distribution. Is this too sloppy?
I should look into this more carefully.)
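
The same normal approximation gives a rough answer to the question quoted
above: how many games for a 90% chance of being within +/- 25 points?
The sketch below is only an estimate under stated assumptions: a normal
distribution for p, the logistic Elo curve D = 400*log10(p/(1 - p)) in
place of the table, and a straight-line approximation of that curve
around p. The name games_needed and its parameters are made up here.

```python
import math
from statistics import NormalDist

def games_needed(p: float, r: float,
                 elo_margin: float = 25.0,
                 confidence: float = 0.90) -> int:
    """Approximate number of games so the Elo estimate is within
    +/- elo_margin with the given two-sided confidence."""
    # Two-sided normal quantile, e.g. about 1.645 for 90%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Per-game variance of the score, as in the formula for s.
    per_game_var = p * (1.0 - p) - r / 4.0
    # Slope of the assumed logistic Elo curve at p (Elo per unit of score).
    elo_per_score = 400.0 / (math.log(10.0) * p * (1.0 - p))
    # Translate the Elo margin into a margin on the score fraction.
    score_margin = elo_margin / elo_per_score
    # Solve z * sqrt(per_game_var / N) <= score_margin for N.
    return math.ceil(z * z * per_game_var / score_margin ** 2)

print(games_needed(0.5, 0.0))   # evenly matched, no draws
print(games_needed(0.5, 0.45))  # evenly matched, 45% draws
```

Under these assumptions the answer for evenly matched opponents is a few
hundred games, far more than a 9-game tournament.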

Example: 20 games, A wins 8, draws 9 and loses 3.

p = (8 + 9*0.5)/20 = 0.625 .

Table lookup: about 90 rating points (or slightly fewer) in A's favor.

r = 9/20 = 0.45

s = sqrt( (0.625*(1 - 0.625) - 0.45/4)/20 ) = 0.078

So with 68% probability the "true" score for A is between
0.625 - 0.078 = 0.547 and 0.625 + 0.078 = 0.703 .
What 0.547 and 0.703 mean in terms of rating points can once again
be found in the table.
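
The whole example can be reproduced with a short Python sketch. The Elo
conversion again assumes the logistic curve D = 400*log10(p/(1 - p))
instead of the table, so the rating figures are approximate. The function
names are hypothetical.

```python
import math

def elo_diff(p: float) -> float:
    # Assumption: logistic Elo curve as a stand-in for the table lookup.
    return 400.0 * math.log10(p / (1.0 - p))

def match_summary(wins: int, draws: int, losses: int):
    """Return (p, r, s) for a match: score fraction, draw fraction,
    and the standard deviation of the score fraction."""
    n = wins + draws + losses
    p = (wins + 0.5 * draws) / n
    r = draws / n
    s = math.sqrt(max(p * (1.0 - p) - r / 4.0, 0.0) / n)
    return p, r, s

p, r, s = match_summary(8, 9, 3)
print(f"p = {p:.3f}, r = {r:.2f}, s = {s:.3f}")
# One and two standard deviations: roughly 68% and 95% intervals.
for k in (1, 2):
    lo, hi = p - k * s, p + k * s
    print(f"+/-{k}s: score {lo:.3f} to {hi:.3f}, "
          f"Elo {elo_diff(lo):+.0f} to {elo_diff(hi):+.0f}")
```

On this curve the 68% interval for the 20-game example runs from roughly
+33 to +150 rating points, which shows how little 20 games pin down.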

Etc... *yawn*

The probability that the above is correct is pretty small, since
I should have been sound asleep long ago. I am writing this
not only out of altruism; I also hope any flaws in my system will be
pointed out.

Good night

Ralf





