Computer Chess Club Archives



Subject: Re: About 100, except...

Author: Bruce Moreland

Date: 22:49:06 09/03/99



On September 03, 1999 at 03:47:55, Harald Faber wrote:

>
>>And I say what you're saying is clearly wrong. Believe me, I learned this the
>>hard way during the last ten years of work on my own chess program. I often had
>>the case that in a first test match of about 30-40 games my program convincingly
>>won a match, then let it play another, longer match overnight and during the
>>next day, which it then lost. You always need the same number of games, no
>>matter how the score is after a first, short match. My experience after hundreds
>>of test matches shows that you need at least 70-80 games to be able to come to
>>a conclusion. And you need some hundred games to be sure. Even if the first 15
>>games end in a 15-0 score. Because the next 15 games may end 0-15. This is a
>>frustrating fact, but it is *a fact*.
>>
>>Heiko.
>
>Except you are Torsten Czub and KNOW from 1 or 2 games which program is
>better/stronger by only looking at the moves and the evaluations during the
>game. :-)))

You can play a million games and not have even the slightest guess which is
stronger if the score is 500,000 to 500,000, although of course you've shown
very conclusively that there is not much difference.

If you play a million games, it should be possible to know that one is stronger
even if it wins by only a few percent, perhaps less.

If you play a shorter match, it should be possible to know that one is stronger
if the score is more lopsided.

And if you play a very short match, it should be possible to know that one is
stronger if the score is very extreme.
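
To put rough numbers on that trade-off, here is a minimal sketch in Python. It is not from the original post and rests on some loud assumptions: draws are ignored (every game is a win or a loss), games are independent, and the normal approximation to the binomial is used, which is crude for very short matches. The function name and the 1% threshold are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def required_win_fraction(games: int, alpha: float = 0.01) -> float:
    """Approximate share of games one engine must win before an
    equal-strength explanation has less than `alpha` probability.

    Assumptions (not from the original post): no draws, independent
    games, one-sided test, normal approximation to the binomial.
    A return value above 1.0 means no score in a match that short
    can reach the threshold under this approximation.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    # One-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha)
    # Under equal strength, the win fraction has mean 0.5 and
    # standard deviation 0.5 / sqrt(games).
    return 0.5 + z * 0.5 / sqrt(games)


if __name__ == "__main__":
    for n in (15, 40, 100, 1_000, 1_000_000):
        print(f"{n:>9} games: need about {required_win_fraction(n):.4f}")
```

Under these assumptions the required margin shrinks with the square root of the number of games, which is why a very long match can settle on a tiny edge while a short one needs a lopsided score.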

The odds of a mistaken assessment in a very short match with an extreme result
should be no worse than in a longer match with a closer, but still sufficiently
decisive, result.  I think it likely that people tend to underestimate how
extreme the result of a longer match has to be in order to be conclusive.

Determining which engine is stronger should be simply a matter of getting a
result that is hard to explain via chance.
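
One way to make "hard to explain via chance" concrete is to ask how often two equally strong engines would produce a score at least this lopsided. The sketch below is my own illustration, not something from the thread: it assumes independent games, ignores draws, and uses an exact one-sided binomial tail. The function name is hypothetical.

```python
from math import comb


def chance_of_score_at_least(wins: int, games: int) -> float:
    """Probability that one of two equally strong engines wins at
    least `wins` out of `games` decisive games.

    Assumptions (not from the original post): no draws, independent
    games, each game a fair coin flip under the equal-strength
    hypothesis. Exact arithmetic, so keep `games` modest.
    """
    if not 0 <= wins <= games:
        raise ValueError("wins must be between 0 and games")
    # Count the outcomes with `wins` or more successes, then divide
    # by the total number of equally likely win/loss sequences.
    favourable = sum(comb(games, k) for k in range(wins, games + 1))
    return favourable / 2 ** games


if __name__ == "__main__":
    # A 15-0 sweep: exactly 1 chance in 32768 under equal strength.
    print(chance_of_score_at_least(15, 15))
    # A narrow 21-19 result is easy to explain by chance.
    print(chance_of_score_at_least(21, 40))
```

A small value means chance is a poor explanation for the score. It does not say anything about the next match, and with draws in the mix the real calculation would be more involved.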

I wrote another post where I go into this in more detail, although still less
than I'd like, given the limits of my competence.

bruce





Last modified: Thu, 15 Apr 21 08:11:13 -0700
