Computer Chess Club Archives


Subject: Re: Some stats...

Author: Richard Pijl

Date: 07:13:32 01/23/04


On January 22, 2004 at 21:02:19, Rolf Tueschen wrote:

>On January 22, 2004 at 20:15:14, Rolf Tueschen wrote:
>
>>On January 22, 2004 at 12:53:16, Christophe Theron wrote:
>>
>>>On January 21, 2004 at 20:00:12, Kolss wrote:
>>>
>>>>Hi,
>>>>
>>>>How many games you need depends on what you want to show, of course... :-)
>>>>If my calculations are correct, I get the following:
>>>>
>>>>Shredder 8 vs. Shredder 7.04:
>>>>
>>>>+90 -65 =145
>>>>
>>>>=> 162.5 - 137.5
>>>>
>>>>=> 54.17 %
>>>>
>>>>=>
>>>>Elo difference = +29
>>>>95 % confidence interval: [+1, +58]
>>>>
>>>>That means that based on this 300-game match (for this particular time control
>>>>on this particular computer with these particular settings etc.), your best
>>>>guess is that S8 is 29 Elo points better than S7.04 (highest likelihood for that
>>>>value); there is a 95 % chance that S8 is between 1 and 58 Elo points better;
>>>>and the likelihood that S8 is (at least 1 Elo point) better than S7.04 is 97.5
>>>>%.
>>
>>
>>This is wrong. Stats doesn't work this way. In your example above 1 Elo is as
>>probable as 58 Elo. There is no way to hypostate that Elo 29 is the "best"
>>guess. With a defined confidence int. of 95% you get a variance of 1 to 58 Elo
>>points. Then you look how your results are differing for two progs. All results
>>between 1 and 58 tell you nothing about differences! You still have to admit
>>that the two progs could be equally strong. You need at least Elo +-59
>
>[correction: you need simply 59 for the difference between progs] for a
>>claim of being better.

What is estimated above by statistical methods is the Elo difference between
Shredder 8 and Shredder 7.04. The point estimate of the difference is +29, and
its 95% confidence interval is [+1, +58]. Since 2.5% of the probability lies
below that interval and 2.5% above it, this means that with a probability of
97.5% Shredder 8 is stronger by at least 1 Elo point.
What do you not understand here?
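
For anyone who wants to check the numbers: below is a rough sketch of one way
to arrive at such an estimate. Kolss did not say how he calculated his figures,
so the method here is an assumption: the usual logistic Elo formula for
converting a score into a rating difference, plus a normal approximation for
the uncertainty of the score. The function names are made up for illustration.

```python
import math

def elo_from_score(p):
    # Logistic Elo model (assumed): expected score p <-> rating difference.
    return 400.0 * math.log10(p / (1.0 - p))

def match_estimate(wins, losses, draws, z=1.96):
    # z = 1.96 gives a two-sided 95 % interval under the normal approximation.
    n = wins + losses + draws
    p = (wins + 0.5 * draws) / n  # score fraction
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - p) ** 2
           + losses * p ** 2
           + draws * (0.5 - p) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    low, high = p - z * se, p + z * se
    return p, elo_from_score(p), elo_from_score(low), elo_from_score(high)

# Shredder 8 vs. Shredder 7.04: +90 -65 =145
p, elo, lo, hi = match_estimate(90, 65, 145)
print(f"score {100 * p:.2f} %, Elo {elo:+.0f}, 95 % interval [{lo:+.0f}, {hi:+.0f}]")
# prints: score 54.17 %, Elo +29, 95 % interval [+1, +58]
```

With these assumptions the sketch reproduces the figures quoted above. Note
that the draws matter: the many drawn games reduce the per-game variance, which
is why the interval is narrower than a pure win/loss calculation would give.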

Richard.


>
>>- NB you propose that the two progs are equally
>>strong and then you test against it. You must top 58. [all this on the base of a
>>specific N of games, the results calculated in Elo; I didn't follow the debate
>>but normally you calculate with scores from the games/matches just for
>>mentioning it]
>>
>>Rolf
>>
>>
>>>>
>>>>So if you "only" want to show that S8 is better, you can - statistically
>>>>speaking - stop now. If you want to "prove" that it is more than 20 Elo points
>>>>better, you need a few more games indeed...
>>>>
>>>>Best regards - Munjong.
>>>
>>>
>>>
>>>It's great to see that at least one guy is able to correctly interpret match
>>>results here.
>>>
>>>I hope you will post more often on this subject. Information on it is very much
>>>needed here.
>>>
>>>
>>>
>>>    Christophe



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.