Computer Chess Club Archives




Subject: Re: A question about statistics...

Author: Peter Berger

Date: 09:47:25 01/04/04


On January 04, 2004 at 12:40:00, Ricardo Gibert wrote:

>On January 04, 2004 at 12:29:15, Mark Young wrote:
>>On January 04, 2004 at 11:46:00, Roger Brown wrote:
>>>Hello all,
>>>I have read numerous posts about the validity - or lack thereof actually - of
>>>short matches between and among chess engines.  The arguments of those who say
>>>that such matches are meaningless (Kurt Utzinger, Christopher Theron, Robert
>>>Hyatt et al.) typically indicate that well over 200 games are required to make any
>>>sort of statistical statement that engine X is better than engine Y.
>>>I concede this point.
>>If you concede this point you don't understand. There is no magic number like
>>200 or 2000. The score must be considered. Here is an example:
>>A score of 17 - 3 in a 20 game match has a certainty of over 99% that the winner
>>of the match is stronger than the loser.
>>A 100 game match ending 55 - 45 only has an 81% chance that the winner of the
>>match is the stronger program.
>>A 200 game match ending 106 - 94 only has a 78% chance that the winner is
>>stronger than the loser.
>Nothing you have said is really correct because you have ignored the significant
>effect of draws in a match.

The percentage of draws doesn't matter at all when the question is which
program is stronger, based on the match results above.

This has been shown by Remi Coulom and explained in multiple posts
here (unfortunately the search engine hasn't found a new home yet).

6-0 with 0 draws and 6-0 with 1000 draws have exactly the same predictive value
when the question is which engine is stronger based on a match result.
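
To make the point concrete, here is a rough sketch of one way to compute such a
number. It assumes a uniform prior on the chance that engine A wins a decisive
game, which is my assumption for illustration and not necessarily the exact
derivation in the posts mentioned above. The function name and signature are
made up. Under that assumption the probability that A is the stronger engine
depends only on wins and losses, so the draws argument is accepted but never
used.

```python
from math import comb


def likelihood_of_superiority(wins: int, losses: int, draws: int = 0) -> float:
    """Probability that engine A is stronger than engine B.

    Assumption (not from the original posts): a uniform prior on p, the
    chance that A wins a decisive game. The posterior is then
    Beta(wins + 1, losses + 1), and P(p > 1/2) reduces to a binomial tail sum.
    The `draws` argument is deliberately ignored: it does not enter the result.
    """
    n = wins + losses + 1
    # P(p > 1/2) = P(Binomial(n, 1/2) <= wins)
    return sum(comb(n, k) for k in range(wins + 1)) / 2 ** n


if __name__ == "__main__":
    # Same decisive score, very different number of draws.
    print(likelihood_of_superiority(6, 0, draws=0))
    print(likelihood_of_superiority(6, 0, draws=1000))
```

Both calls print the same value, because only the decisive games carry
information about which engine is stronger in this model. The figures quoted
earlier in the thread may come from a different method, so I would not expect
this sketch to reproduce them exactly.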



Last modified: Thu, 07 Jul 11 08:48:38 -0700
