Computer Chess Club Archives

Subject: Re: How many games are needed to find out which program is stronger?

Author: Terry Ripple

Date: 00:30:10 09/03/99

On September 03, 1999 at 03:16:50, Bruce Moreland wrote:

>On September 02, 1999 at 20:02:55, Heiko Mikala wrote:
>
>>And I say what you're saying is clearly wrong. Believe me, I learned this the
>>hard way during the last ten years of work on my own chess program. I often had
>>the case that in a first test match of about 30-40 games my program convincingly
>>won, then I let it play another, longer match overnight and during the
>>next day, which it then lost. You always need the same number of games, no
>>matter what the score is after a first, short match. My experience after hundreds
>>of test matches shows that you need at least 70-80 games to be able to come to
>>a conclusion, and you need a few hundred games to be sure, even if the first 15
>>games end in a 15-0 score, because the next 15 games may end 0-15. This is a
>>frustrating fact, but it is *a fact*. It's frustrating because for us as
>>programmers it means that we have to do much more time-consuming testing than
>>we would like to do.
>
>It shouldn't work like this.  You can't take a selection from somewhere in the
>middle of a long run of games, and use that to prove anything, but if you start
>out and play some games, and one program wins several games in a row, you
>should be able to draw a safe conclusion.
>
>I would really like to understand the 15-0 and 0-15 situation.  That should
>*not* happen.  That's not how math should work.  If you flip a coin 15 times and
>it comes up heads each time, the odds of this happening completely by chance are
>extremely small.  The odds that it would then come up tails 15 times in a row
>are also extremely small, and combined they should be vanishingly small.
>
>You can find a run where this happens, with two equal strength programs, but it
>should have to be an extremely large run.
>
>Maybe there is something going on that destroys the randomness of the whole
>thing -- for instance it could be a problem involving a narrow book.
>
>bruce
---------

Bruce, I think you're on to something! Book learning sometimes creates a very
narrow book. This happened to me when I used it for a lot of blitz games in
engine vs. engine matches, so I don't use book learning for blitz anymore!

Terry
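
For anyone who wants to put numbers on Bruce's coin-flip argument above, here is a minimal sketch in Python. It assumes things the thread does not state: every game is independent, there are no draws, and the two programs are exactly equal (win probability 1/2). The function names are made up for illustration. A narrow learned book would break the independence assumption, which is exactly the point being discussed.

```python
from fractions import Fraction
from math import comb

HALF = Fraction(1, 2)  # assumption: equal programs, no draws


def prob_streak_at_start(n, p=HALF):
    """Chance that one named side wins the first n games in a row,
    assuming independent games with win probability p."""
    return p ** n


def prob_at_least(wins, games, p=HALF):
    """Chance that one named side scores at least `wins` out of `games`
    (binomial tail), under the same independence assumption."""
    return sum(
        comb(games, k) * p ** k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


# 15-0 in the first 15 games between equal programs
print(prob_streak_at_start(15))        # 1/32768

# 15-0 immediately followed by 0-15 (30 specific results in a row)
print(prob_streak_at_start(15) ** 2)   # 1/1073741824

# A "convincing" short match, e.g. at least 25 wins out of 40
print(float(prob_at_least(25, 40)))
```

Under these assumptions a 15-0 start is about a 1 in 33,000 event, and 15-0 followed by 0-15 is about 1 in a billion, so seeing it in practice suggests the games are not independent.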

Last modified: Thu, 15 Apr 21 08:11:13 -0700