Computer Chess Club Archives


Subject: Re: Spike 1.0 Mainz is too strong for Zappa 1.1 so far 16 to 10

Author: Uri Blass

Date: 22:34:17 08/29/05


On August 29, 2005 at 18:51:53, Dann Corbit wrote:

>On August 29, 2005 at 18:21:33, Peter Berger wrote:
>
>>On August 29, 2005 at 10:40:54, Kurt Utzinger wrote:
>>
>>>On August 29, 2005 at 06:36:42, Jorge Pichard wrote:
>>>
>>>>   Engine  Score
>>>>1: Spike10 16/26  1=010==1===110=110110=11==
>>>>2: Zappa   10/26  0=101==0===001=001001=00==
>>>
>>>      After only 26 games and a winning score of 61 %
>>>      it's too early for such a statement I think.
>>>      Kurt
>>
>>That's true. But only barely.
>>
>>Assuming that everything is set up properly, that games are independent events
>>(i.e. no learning), and that White and Black are equally likely to win (stated
>>just for the sake of correctness; I am actually pretty sure this doesn't make a
>>major difference),
>>
>>the result is good enough to claim that Spike is better with 90% confidence. And
>>only one more win in the following game would have been enough for 95%
>>confidence in fact ;) .
>>
>>How do you feel about this one?
>>
>>A 1 1 1 1 1
>>B 0 0 0 0 0
>>
>>More games needed? Not if you can live with 97% confidence .
>
>Of course, if we recall the Cadaques tournament of some years ago, it started as
>a whitewash for Junior, but Junior eventually lost (possibly due to learning, so
>your statement above may apply).
>
>>Hmm, let's go back to the imagined 17/27 from Spike. We need more games?
>>
>>OK. Let's look at this result:
>>
>>Wins: 12
>>Loss: 5
>>Draws: 100000
>>
>>Better? Worse? No, the same.
>
>I don't put much credence in any result of fewer than 30 games.
>After 30 games, you get a lot more plausibility.

I disagree.

If you see 20-0, you do not need to wait for 30 games to be practically sure that
the winner is better.
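
To put a rough number on this, here is a minimal sketch of a one-sided sign test
on decisive games only. It assumes independent games, equal chances for both
sides under the null hypothesis, and that draws carry no information; the
function name sign_test_p_value is made up for illustration. The posts above do
not say which test produced the quoted confidence figures, so this need not
reproduce them exactly.

```python
from math import comb


def sign_test_p_value(wins: int, losses: int) -> float:
    """Chance that an equally strong engine scores at least `wins` wins
    out of wins + losses decisive games. Draws are ignored entirely."""
    decisive = wins + losses
    # Sum the upper tail of a Binomial(decisive, 0.5) distribution.
    tail = sum(comb(decisive, k) for k in range(wins, decisive + 1))
    return tail / 2 ** decisive


# Results mentioned in this thread, written as (wins, losses).
for wins, losses in [(20, 0), (11, 5), (5, 0)]:
    p = sign_test_p_value(wins, losses)
    print(f"{wins}-{losses}: p = {p:.7f}, confidence = {1 - p:.4%}")
```

Under these assumptions 20-0 has a tail probability of 1 in 2^20, roughly one in
a million, and adding any number of draws does not change the result.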

On the other hand, I saw a result of 18-12 in a Nunn match that later dropped to
less than 50% for the side that had led 18-12.

I think it may be better to continue until the difference in score reaches 10
points and stop there.

The idea of a fixed number of games is bad.
The point is that if the difference is big, you do not want to waste too much
time on testing, and if the difference is small, it is less important to avoid
errors in deciding which version is stronger.
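
The stopping rule can be sketched as follows. This is only an illustration under
stated assumptions: each game is scored +1 (win), -1 (loss) or 0 (draw) from one
side's point of view, so a draw leaves the point difference unchanged. The names
games_until_lead and simulate, and the max_games cap, are hypothetical and not
part of any existing testing tool.

```python
import random


def games_until_lead(results, lead=10):
    """Play through `results` (+1 win, -1 loss, 0 draw) and stop as soon as
    the point difference reaches `lead` in either direction.
    Returns (games_played, final_difference)."""
    diff = 0
    played = 0
    for outcome in results:
        played += 1
        diff += outcome
        if abs(diff) >= lead:
            break
    return played, diff


def simulate(p_win, p_loss, lead=10, max_games=10_000, seed=0):
    """Simulate one match with fixed win/loss probabilities (the rest are
    draws). max_games is a safety cap, an assumption added for this sketch."""
    rng = random.Random(seed)

    def results():
        for _ in range(max_games):
            x = rng.random()
            if x < p_win:
                yield 1
            elif x < p_win + p_loss:
                yield -1
            else:
                yield 0

    return games_until_lead(results(), lead)


# A clearly stronger engine versus a nearly equal one (made-up probabilities).
print(simulate(p_win=0.50, p_loss=0.20))
print(simulate(p_win=0.32, p_loss=0.30))
```

With a big difference in strength the match should usually end quickly, and with
a small difference it can run for a long time, which is exactly the trade-off
described above.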

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.