Computer Chess Club Archives


Subject: Re: Fritz is not champ, the match result is under the margin of error

Author: blass uri

Date: 01:36:22 07/29/00


On July 29, 2000 at 03:50:59, Terry Ripple wrote:

>On July 28, 2000 at 15:45:15, Christophe Theron wrote:
>
>>On July 28, 2000 at 01:05:53, Terry Ripple wrote:
>>
>>>Used an AMD K6-2, 266Mhz, 64Ram, Ponder off, 16Mb Hash per engine.
>>>
>>>If anyone cares to see some or all of the games, i will be glad to post them.
>>> This match shows how close the strengths are between these two fine engines!
>>>
>>>Best regards,
>>>Terry
>>>
>>>Blitz:5'  2000
>>>
>>>
>>>1   Fritz 6      158.0/306
>>>2   Hiarcs 7.32  148.0/306
>>
>>
>>
>>No offense intended Terry, but you cannot say with this match which program is
>>the best.
>>
>>The result of this match is 51.63% in favor of Fritz.
>>
>>I don't have the typical margin of error for 306 games, but I know that for 400
>>games it is +/-2.5% (80% confidence) and +/-2.1% (70% confidence).
>>
>>So even if you got this 51.63% with a 400 games match, you couldn't say which
>>program won because 51.63% is between 47.5% and 52.5% (80% confidence). You
>>couldn't even say Fritz is better with 70% confidence.
>>
>>That's the problem with chess matches results... You have to apply some
>>statistic formulas and sometimes you discover that the match does not say which
>>is best...
>>
>>
>>
>>    Christophe
>
> Please explain where you may get a margin of error when there isn't a human
>operator making any moves on the chess board? Please, i would like to learn more
>about this!
>
>Regards, Terry

You can get the margin of error by the following experiment:

Throw a coin 400 times and count the number of heads.
Repeat the experiment again and again.
The best guess is 200 heads in every experiment.

In about 70% of the cases this guess will be wrong by not more than 2.5% of the
400 throws (10 heads), and in about 60% of the cases by not more than 2.1%.


This is not exactly right for chess, because there are draws, so the coin should
have 3 results.
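
The coin experiment can also be run on a computer instead of by hand. The following is only a rough sketch: the function name `fraction_within`, the number of repetitions and the seed are my own choices for illustration. It repeats the 400-throw experiment many times and reports how often the head count lands within a given percentage of the 400 throws from the guess of 200.

```python
import random

def fraction_within(margin_pct, flips=400, trials=5000, seed=1):
    """Fraction of experiments whose head count is within margin_pct of flips/2.

    margin_pct is a percentage of the number of flips, so 2.5 with
    400 flips means at most 10 heads away from 200.
    """
    rng = random.Random(seed)
    allowed = margin_pct / 100 * flips
    hits = 0
    for _ in range(trials):
        # Each of the `flips` random bits is one fair coin throw; count the 1s as heads.
        heads = bin(rng.getrandbits(flips)).count("1")
        if abs(heads - flips / 2) <= allowed:
            hits += 1
    return hits / trials

for margin in (2.1, 2.5):
    print(margin, fraction_within(margin))
```

Because the result is a simulation, the printed fractions move a little when the seed or the number of trials changes.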




The following explanation may not be clear to you if you have not studied
statistics, but I will give it anyway.
Let us assume that the probability for white to win is 40%, the probability for a
draw is 30% and the probability for black to win is 30%.
Scoring a win as 1, a draw as 0.5 and a loss as 0, white's expected score in one
game is 0.4*1+0.3*0.5+0.3*0=0.55.
The variance in one game is
0.4*0.45*0.45+0.3*0.05*0.05+0.3*0.55*0.55=0.081+0.00075+0.09075=0.1725
The variance in 400 games (assuming the games are independent) is 0.1725*400=69.
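
The same arithmetic can be checked with a few lines of code. This is a minimal sketch under the assumptions above (40% win, 30% draw, 30% loss, independent games, a win scored as 1, a draw as 0.5, a loss as 0). The function name `score_variance` is hypothetical.

```python
def score_variance(p_win, p_draw, p_loss, games):
    """Return (expected score per game, variance per game, variance over `games` games)."""
    outcomes = [(1.0, p_win), (0.5, p_draw), (0.0, p_loss)]
    mean = sum(score * p for score, p in outcomes)
    var_game = sum(p * (score - mean) ** 2 for score, p in outcomes)
    # Variances of independent games add up.
    return mean, var_game, var_game * games

mean, var_game, var_total = score_variance(0.4, 0.3, 0.3, 400)
print(mean, var_game, var_total)
```

Other win/draw/loss probabilities can be passed in to see how the draw rate changes the variance.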

This gives a standard deviation of about 8.3 points, which is about 2.1% of 400
games.
The probability of being within one standard deviation of the true value,
assuming a normal distribution, is about 68%.

The distribution of the total score over 400 games is close to a normal
distribution, so you get +-2.1% with about 68% confidence.
This is close to Christophe's +-2.1% at 70% confidence, even though my model
is more complicated, and 1.63% is not significant even if you have 400 games.

I should look at the right tables to find the probability of being within
1.63/2.1~=0.78 standard deviations, but I do not have the tables near me.
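
Without tables, the probability can be computed from the error function, since for a normal distribution P(|Z| <= z) = erf(z/sqrt(2)). The following is a sketch only, assuming the normal approximation and the 40/30/30 model above. The helper name `within_sd` is my own.

```python
import math

def within_sd(z):
    """P(|Z| <= z) for a standard normal variable Z."""
    return math.erf(z / math.sqrt(2))

# Standard deviation of the score over 400 games, as a percentage of 400 games.
sd_pct = 100 * math.sqrt(0.1725 * 400) / 400
z = 1.63 / sd_pct  # how many standard deviations a 1.63% lead is

print(sd_pct, z, within_sd(z), within_sd(1.0))
```

The last two printed values are the probability of staying within z standard deviations and within exactly one standard deviation (about 0.68 for any normal distribution).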

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.