Computer Chess Club Archives



Subject: Re: This sounds more logical :)

Author: Uri Blass

Date: 06:01:37 09/09/04



On September 09, 2004 at 05:59:01, Kurt Utzinger wrote:

>On September 09, 2004 at 04:59:21, Olivier Deville wrote:
>
>>Too few games to draw conclusions though...
>>
>>Olivier
>
>     Hi Olivier
>     Correct conclusion, in my opinion. As I have already
>     posted here several times, consider the following example:
>     a 100-game match [40'/40] that I played between
>     Gandalf 4.32g and Program_X [I am a beta tester of X]:
>
>
>Gandalf 4.32g vs Program X
>
>Games 1-10
>3.0-7.0 [win program X]
>Total 3.0-7.0 for program X
>
>Games 11-20
>6.5-3.5 [win Gandalf]
>Total 9.5-10.5 for program X
>
>Games 21-30
>5.0-5.0 [draw]
>Total 14.5-15.5 for program X
>
>Games 31-40
>3.5-6.5 [win program X]
>Total 18.0-22.0 for program X
>
>Games 41-50
>4.5-5.5 [win program X]
>Total 22.5-27.5 for program X
>
>Games 51-60
>3.0-7.0 [win program X]
>Total 25.5-34.5 for program X
>
>Games 61-70
>5.0-5.0 [draw]
>Total 30.5-39.5 for program X
>
>Games 71-80
>8.0-2.0 [win Gandalf]
>Total 38.5-41.5 for program X
>
>Games 81-90
>7.0-3.0 [win Gandalf]
>Total 45.5-44.5 for Gandalf
>
>Games 91-100
>5.5-4.5 [win Gandalf]
>Final match result 51.0-49.0 for Gandalf
>
>Can anybody tell me for sure which of the two programs above is the stronger one?
>And what if I had played only a 20-game match, and those games had been the ones
>played in rounds 71-90? Then the result would have been 15.0-5.0 in
>favour of Gandalf 4.32g!! Imagine what some testers would have argued about the
>strength of program X.
>
>For all these reasons I think that something concrete about the relative strength
>of two programs can only be said once 100, or better 200-300 or even more, games
>have been played.
>
>Kurt

That depends on the result.
You do not need 200-300 games if you see 50-0 or even 48-2 between two programs.

In that case you can conclude that the winner is better, unless the winner simply
repeats the same win again and again. (Repeated wins do not mean that the winner
has learning. It is even possible that both programs have no book and no
learning, so in that case you cannot even conclude that the winner has
better learning.)

On the other hand, if you see 151-149 after 300 games, you cannot know which
program is better. One of them may well be better, but because the difference is
very small you would need something like 30000 games to know for sure which.
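
To put rough numbers on this, here is a minimal sketch using a normal approximation. It is only an illustration, not a rigorous test: the function names are made up for this example, and it assumes every game is decisive (per-game variance 0.25). Draws lower the real variance, so the estimate errs on the cautious side. It also ignores the repeated-game problem mentioned above.

```python
import math

def z_score(points, games):
    """Standard score of a match result against the 'equal strength' hypothesis.

    Assumption: every game is decisive, so the per-game variance is 0.25.
    Draws lower the real variance, which makes this estimate conservative.
    """
    return (points - games / 2) / math.sqrt(games * 0.25)

def two_sided_p(z):
    """Probability of a deviation at least this large between equal programs."""
    return math.erfc(abs(z) / math.sqrt(2))

def games_needed(score_rate, z=2.0):
    """Rough number of games before an edge of this size reaches z sigma."""
    edge = abs(score_rate - 0.5)
    return math.ceil((z * 0.5 / edge) ** 2)

# Results discussed in this thread: 48-2, 51-49 and 151-149.
for points, games in [(48, 50), (51, 100), (151, 300)]:
    z = z_score(points, games)
    print(f"{points}/{games}: z = {z:.2f}, p = {two_sided_p(z):.3g}")

print("games needed if 151/300 were the true rate:", games_needed(151 / 300))
```

Under these assumptions 48-2 is far outside what two equal programs would produce, while 51-49 and 151-149 are well within normal noise. If 151/300 were the true scoring rate, the estimated number of games needed comes out in the tens of thousands; the exact figure depends on the confidence level you pick and on the draw rate.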

Uri




Last modified: Thu, 15 Apr 21 08:11:13 -0700
