Computer Chess Club Archives


Subject: Some statistics

Author: Maurizio De Leo

Date: 16:48:18 09/28/05


First of all, let me recall the current CEGT numbers.

Fritz 9 ---> 2764 +- 33
Fritz 8 ---> 2712 +- 12

I slightly changed the elo values in the quoted part to match the current values
and avoid confusion.

George wrote:
>> 1)With 95% probability Fritz9's rating is at least 2731
>> 2)With 95% probability Fritz8 Bilbao's rating is at most 2724

This is not completely true. First of all, if CEGT used 95% confidence
intervals, the correct statements are:

1)With 97.5% probability Fritz9's rating is at least 2731
2)With 97.5% probability Fritz8 Bilbao's rating is at most 2724

This is because the 5% refers to both tails of the bell curve, while here we
are considering only one of the two.
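
To see the one-tail versus two-tail point with numbers, here is a small Python
sketch. It assumes the rating error is normally distributed and that the "+-"
figure is the half-width of a two-sided 95% interval. Both are assumptions,
since the list may compute its error bars differently.

```python
from statistics import NormalDist

# Assumption: rating errors are normal and "+-" is the half-width
# of a two-sided 95% confidence interval.
std_normal = NormalDist()  # mean 0, sigma 1

# z-value that leaves 2.5% in each tail (about 1.96)
z = std_normal.inv_cdf(0.975)

# Probability of being above the lower bound only (one tail excluded)
one_sided = std_normal.cdf(z)

# Probability of being between the lower and the upper bound
two_sided = std_normal.cdf(z) - std_normal.cdf(-z)

print(f"z = {z:.2f}, one-sided = {one_sided:.3f}, two-sided = {two_sided:.3f}")
```

The same interval covers 95% when both bounds count, but 97.5% when only one
bound is of interest.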

Moreover we can't conclude that

>> There are 95% chances that Fritz9 is better than Fritz8 Bilbao.
>> But still 5% that Fritz8 Bilbao is better than Fritz9..........

You have to consider the distribution of the difference in rating. Its mean
value is the difference of the mean ratings of the two programs. At the present
moment it is

Mean diff. = 2764 - 2712 = 52

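The standard deviation of the difference, if I remember correctly, is the
square root of the sum of the squares of the two standard deviations (assuming
the two ratings are independent). Assuming 95% confidence, i.e. "+-" is about
two standard deviations: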

sigma1 = 33/2 = 16.5
sigma2 = 12/2 =  6

sigma diff. = sqrt( 16.5^2 + 6^2 ) = 17.56
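
The arithmetic above can be reproduced with a few lines of Python. This is only
a sketch: it assumes the two rating errors are independent and that "+-" is
roughly two standard deviations. The variable names are mine.

```python
import math

# Assumption: "+-" is treated as roughly two standard deviations.
half_width_f9 = 33.0   # Fritz 9 error bar
half_width_f8 = 12.0   # Fritz 8 error bar

sigma_f9 = half_width_f9 / 2
sigma_f8 = half_width_f8 / 2

# Independent errors add in quadrature: sqrt(sigma_f9^2 + sigma_f8^2)
sigma_diff = math.hypot(sigma_f9, sigma_f8)

print(f"sigma1 = {sigma_f9}, sigma2 = {sigma_f8}, sigma diff. = {sigma_diff:.2f}")
```

Note that the larger error bar dominates: the result is only slightly bigger
than sigma1 alone.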

So the value "ZERO" is about THREE standard deviations (2.96, to be precise)
away from the mean of the difference, which corresponds to a two-tailed
probability of about 0.3%. Moreover, given the one-tail point discussed above,
the probability of Fritz 8 being stronger than Fritz 9 is only about 0.15%.
The other 0.15% is the possibility of Fritz 9 being more than 104 (i.e. 2*52,
the mirror image of zero about the mean) points stronger than Fritz 8.
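
For anyone who wants to check the tail probabilities numerically, here is a
minimal Python sketch. It assumes the rating difference is normally
distributed, that the two error bars are independent, and that "+-" is about
two standard deviations. It prints how many standard deviations zero lies from
the mean, and the probability in each tail.

```python
from statistics import NormalDist

# Assumptions: normal errors, independent ratings,
# "+-" is about two standard deviations.
mean_diff = 2764 - 2712                                # Fritz 9 minus Fritz 8
sigma_diff = ((33 / 2) ** 2 + (12 / 2) ** 2) ** 0.5    # errors add in quadrature

diff = NormalDist(mu=mean_diff, sigma=sigma_diff)

# Distance of zero from the mean, in standard deviations
z = mean_diff / sigma_diff

# Lower tail: the true difference is below zero (Fritz 8 stronger)
p_f8_stronger = diff.cdf(0)

# Upper tail: the true difference exceeds twice the mean (mirror of zero)
p_gap_over_mirror = 1 - diff.cdf(2 * mean_diff)

print(f"z = {z:.2f}")
print(f"P(Fritz 8 stronger)        = {p_f8_stronger:.4%}")
print(f"P(gap above {2 * mean_diff} points)  = {p_gap_over_mirror:.4%}")
```

By symmetry the two tails come out equal, so only the lower one matters for the
question of which program is stronger.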

In conclusion, I think that the CEGT data strongly support the conclusion that
Fritz 9 is stronger than Fritz 8.

Maurizio



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.