Computer Chess Club Archives


Subject: Re: How many games are needed to find out which program is stronger?

Author: Harald Faber

Date: 00:51:05 09/03/99


On September 03, 1999 at 03:16:50, Bruce Moreland wrote:

>On September 02, 1999 at 20:02:55, Heiko Mikala wrote:
>
>>And I say what you're saying is clearly wrong. Believe me, I learned this the
>>hard way during the last ten years of work on my own chess program. I often had
>>the case that in a first test match of about 30-40 games my program convincingly
>>won a match, then I let it play another, longer match overnight and during the
>>next day, which it then lost. You always need the same number of games, no
>>matter what the score is after a first, short match. My experience after hundreds
>>of test matches shows that you need at least 70-80 games to be able to come to
>>a conclusion. And you need some hundred games to be sure. Even if the first 15
>>games end in a 15-0 score. Because the next 15 games may end 0-15. This is a
>>frustrating fact, but it is *a fact*. It's frustrating, because for us as
>>programmers it means that we have to do much more time-consuming testing than
>>we would like to do.
>
>It shouldn't work like this.  You can't take a selection from somewhere in the
>middle of a long run of games, and use that to prove anything, but if you start
>out and play some games, and one program wins several games in a row, you
>should be able to make a safe conclusion.
>
>I would really like to understand the 15-0 and 0-15 situation.  That should
>*not* happen.  That's not how math should work.  If you flip a coin 15 times and
>it comes up heads each time, the odds of this happening completely by chance are
>extremely small.  The odds that it would then come up tails 15 times in a row
>are also extremely small, and combined they should be vanishingly small.
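To put numbers on the coin-flip argument above, here is a minimal sketch. It assumes a fair coin and independent flips, which is only a rough stand-in for chess games (draws, opening choice and learning effects are ignored). The helper name `prob_run` is made up for this illustration.

```python
from fractions import Fraction


def prob_run(n, p=Fraction(1, 2)):
    """Probability that n independent trials all give one named outcome.

    p is the per-trial probability of that outcome (assumed 1/2 here,
    i.e. a fair coin; this is an assumption, not a model of real engines).
    """
    return p ** n


# 15 heads in a row with a fair coin: 1/32768
p15 = prob_run(15)

# 15 heads followed by 15 tails: (1/2)^30 = 1/1073741824
p_both = p15 * prob_run(15)

print("15 in a row:          ", p15, float(p15))
print("15 heads then 15 tails:", p_both, float(p_both))
```

Under these assumptions a 15-0 run happens about once in 33,000 tries, and 15-0 followed by 0-15 about once in a billion, so if such a swing really shows up, the games were probably not independent trials of the same two opponents.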

Maybe Heiko exaggerated a bit, but many people have seen results like 9-1 or
8.5-1.5 in a short series where the final outcome was much different, much more equal...

That was the point.
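A rough way to see how often a short match looks lopsided purely by chance is an exact binomial calculation. This is only a sketch under stated assumptions: two equally strong programs, independent games, and no draws (so an 8.5-1.5 score cannot be represented). The function name `prob_lopsided` is hypothetical.

```python
from math import comb


def prob_lopsided(games, min_wins, p=0.5):
    """Probability that either side wins at least min_wins of `games` games.

    p is the per-game win probability of the first program (0.5 = equal
    strength). Draws are ignored, which is a simplifying assumption.
    Requires min_wins > games / 2 so the two cases cannot overlap.
    """
    if 2 * min_wins <= games:
        raise ValueError("min_wins must be more than half of games")

    def pmf(k):
        # Binomial probability of exactly k wins for the first program.
        return comb(games, k) * p ** k * (1 - p) ** (games - k)

    first_side = sum(pmf(k) for k in range(min_wins, games + 1))
    second_side = sum(pmf(k) for k in range(0, games - min_wins + 1))
    return first_side + second_side


# Chance that a 10-game match between equal programs ends 9-1 or 10-0
# for one side or the other: 22/1024, a little over 2%.
print(prob_lopsided(10, 9))
```

Under these assumptions, roughly one 10-game match in 47 between equal programs ends 9-1 or worse, so someone who runs many short matches will see such a score from time to time.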

>bruce



Last modified: Thu, 15 Apr 21 08:11:13 -0700
