Computer Chess Club Archives


Subject: Re: ELO performance?

Author: Peter Fendrich

Date: 16:31:10 05/23/99


On May 23, 1999 at 07:55:59, Stefan Meyer-Kahlen wrote:

>On May 22, 1999 at 14:22:26, Peter Fendrich wrote:
>
>>Sorry, I forgot to answer the 95% question... :)
>>
>>To estimate the standard deviation of the Performance Rating there are
>>lots of options, giving about the same result. You need the individual results
>>to know the error margin, and that's what this is all about. There is a
>>difference between a result of only wins and losses compared to a result with a
>>lot of draws. The SSDF method is best described as:
>>       (1) s=SQRT(W(1-m)**2 + D(0.5-m)**2 + L(0-m)**2/(n-1))
>>       (2) A=1.96 * s/SQRT(n)
>>
>>       where s is estimated standard deviation
>>             n     is number of games
>>             m     is score/n
>>             W,D,L is number of wins, draws and losses respectively
>>             A     is the margin of error (for score, not rating points)
>>             1.96  is fetched from the Normal Distribution table to get
>>                   95% reliability
>>             SQRT  is the square root
>>             **    is used as 'the power of'
>>
>>    Now we have a 95% interval of *scores* from m-A to m+A
>>
>>    Compute the ratings for each m-A and m+A and here we go!
>>    These ratings are the end points in the interval.
>>
>>//Peter
>
>
>Yes, that's what I was looking for, thanks.
>I think (1) should be
>
>  (1) s=SQRT((W(1-m)**2 + D(0.5-m)**2 + L(0-m)**2)/(n-1))
>
>right?
>
>
>Let's assume A and B play 3 games, A wins 2, 1 is a draw.
>If I use your formula I get
>   s = 0.2886
>and
>   A = 0.3266
>
>If I calculate m-A I get 0.5067, which is > 0.5!
>There must be something wrong, because I don't think you can say that A is better
>than B with 95% prob. just after 3 games.
>
>Stefan

Your correction of (1) is right!
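
For anyone who wants to check the numbers, here is a minimal Python sketch of
the corrected (1) together with (2). The function name score_margin and its
parameters are only illustrative choices, not anything the SSDF uses.

```python
import math

def score_margin(wins, draws, losses, z=1.96):
    """Return (m, s, A) using the corrected formula (1) and formula (2).

    z=1.96 is the Normal-table value for 95% reliability.
    Needs at least 2 games because of the division by (n-1).
    """
    n = wins + draws + losses
    m = (wins + 0.5 * draws) / n  # score/n
    # Sum of squared deviations of each game result from the mean score
    ss = wins * (1 - m) ** 2 + draws * (0.5 - m) ** 2 + losses * (0 - m) ** 2
    s = math.sqrt(ss / (n - 1))   # (1), with the whole sum divided by n-1
    a = z * s / math.sqrt(n)      # (2)
    return m, s, a

# Stefan's example: 3 games, 2 wins and 1 draw
m, s, a = score_margin(2, 1, 0)
print(round(s, 4))      # 0.2887
print(round(a, 4))      # 0.3267
print(round(m - a, 4))  # 0.5067
```

This reproduces Stefan's values up to rounding, so the arithmetic is fine; the
problem with the 3 games example lies elsewhere.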

About the 3-game match: confusing, isn't it?
The formulas given are based on the Normal distribution, but the population (all
the results) isn't Normally distributed by itself.
We are leaning on a theorem (the Central Limit Theorem) telling us that we can
approximate the true distribution of the mean score by the Normal one when the
number of results is large enough.
How many results are needed in order to apply the formulas then? Well, there are
no fixed numbers, but one usually says somewhere between 20 and 30. With 50 or
more games you can be quite confident in the calculated margins, but with a very
low number of games, below 10 or so, you shouldn't even try to use the formulas.
When you are using the formulas you should know that there are two sources of
error: one from the distribution covered by the formulas, and one from the fact
that we are only approximating with the Normal distribution. The second source
of error is not covered by the formulas given here.
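
To make the second source of error visible, here is a rough Python sketch that
computes how often the "95%" interval really contains the true expected score.
It assumes independent games with fixed win/draw/loss probabilities (the values
0.3/0.4/0.3 are made up for illustration) and enumerates every possible match
result exactly, so no random simulation is involved. The function names are
hypothetical.

```python
import math

def margin(wins, draws, losses, z=1.96):
    """Mean score m and margin A from formulas (1) (corrected) and (2)."""
    n = wins + draws + losses
    m = (wins + 0.5 * draws) / n
    ss = wins * (1 - m) ** 2 + draws * (0.5 - m) ** 2 + losses * (0 - m) ** 2
    s = math.sqrt(ss / (n - 1))
    return m, z * s / math.sqrt(n)

def coverage(n, p_win, p_draw):
    """Exact probability that [m-A, m+A] contains the true expected score.

    Assumes n >= 2 independent games with constant outcome probabilities.
    """
    p_loss = 1.0 - p_win - p_draw
    true_m = p_win + 0.5 * p_draw
    covered = 0.0
    for w in range(n + 1):
        for d in range(n - w + 1):
            l = n - w - d
            # Trinomial probability of exactly w wins, d draws, l losses
            prob = (math.comb(n, w) * math.comb(n - w, d)
                    * p_win ** w * p_draw ** d * p_loss ** l)
            m, a = margin(w, d, l)
            if m - a <= true_m <= m + a:
                covered += prob
    return covered

for games in (3, 10, 30, 50):
    print(games, round(coverage(games, 0.3, 0.4), 3))
```

Under these assumptions the 3-game coverage falls well short of 95%, while the
50-game coverage comes much closer to it.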

With 3 games, whatever the result is, we don't know shit ... :)
Hope this makes sense...

Another thing: Don't use the formulas with only one type of result (all wins,
all draws or all losses). Then s is 0 and the margin A collapses to zero.
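
A small Python sketch of why this matters, which also carries the interval
through to rating points. Note the assumption: the conversion below is the
common logistic Elo formula, 400*log10(p/(1-p)); the SSDF's own conversion is
not given in this thread and may differ. Function names are illustrative only.

```python
import math

def score_interval(wins, draws, losses, z=1.96):
    """Return (m-A, m+A) from the corrected (1) and (2). Needs n >= 2."""
    n = wins + draws + losses
    m = (wins + 0.5 * draws) / n
    ss = wins * (1 - m) ** 2 + draws * (0.5 - m) ** 2 + losses * (0 - m) ** 2
    a = z * math.sqrt(ss / (n - 1)) / math.sqrt(n)
    return m - a, m + a

def elo_diff(score):
    """Rating difference for an expected score (logistic Elo, an assumption).

    Scores at or beyond 0 and 1 have no finite rating difference.
    """
    if score <= 0.0:
        return -math.inf
    if score >= 1.0:
        return math.inf
    return 400.0 * math.log10(score / (1.0 - score))

# All wins: every deviation from the mean is zero, so the "interval" is a point
lo, hi = score_interval(10, 0, 0)
print(lo, hi)                      # 1.0 1.0
print(elo_diff(lo), elo_diff(hi))  # inf inf

# A mixed result gives a usable interval in rating points
lo, hi = score_interval(30, 10, 10)
print(round(elo_diff(lo)), round(elo_diff(hi)))
```

A zero-width interval after 10 straight wins obviously doesn't mean the result
is known exactly; the formulas simply have nothing to estimate the spread from.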

//Peter



Last modified: Thu, 07 Jul 11 08:48:38 -0700