Author: Kurt Utzinger
Date: 02:59:01 09/09/04
On September 09, 2004 at 04:59:21, Olivier Deville wrote:
>Too few games to draw conclusions though...
>
>Olivier
Hi Olivier
Correct conclusion in my opinion. As I have already
posted here often, the following is an example of a
match [40'/40] that I played over 100 games between
Gandalf 4.32g and Program X [I am a beta tester of X]:
Gandalf 4.32g vs Program X
Games 1-10
3.0-7.0 [win program X]
Total 3.0-7.0 for program X
Games 11-20
6.5-3.5 [win Gandalf]
Total 9.5-10.5 for program X
Games 21-30
5.0-5.0 [draw]
Total 14.5-15.5 for program X
Games 31-40
3.5-6.5 [win program X]
Total 18.0-22.0 for program X
Games 41-50
4.5-5.5 [win program X]
Total 22.5-27.5 for program X
Games 51-60
3.0-7.0 [win program X]
Total 25.5-34.5 for program X
Games 61-70
5.0-5.0 [draw]
Total 30.5-39.5 for program X
Games 71-80
8.0-2.0 [win Gandalf]
Total 38.5-41.5 for program X
Games 81-90
7.0-3.0 [win Gandalf]
Total 45.5-44.5 for Gandalf
Games 91-100
5.5-4.5 [win Gandalf]
Final match result 51.0-49.0 for Gandalf
Can anybody tell me for sure which of the above two is the stronger program??
And what if I had played only a 20-game match, and those games had been the
ones played in rounds 71-90? Then the result would have been 15.0-5.0 in
favour of Gandalf 4.32g!! Imagine what some testers would have argued about the
strength of program X.
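To make the point about short matches concrete, here is a small Python sketch that adds up every pair of consecutive 10-game blocks from the table above. The block scores are the ones reported in this post. The function name `twenty_game_windows` is only illustrative, and the windows are limited to 10-game boundaries because the individual game results are not listed here.

```python
# Gandalf's score in each 10-game block, copied from the match table.
gandalf = [3.0, 6.5, 5.0, 3.5, 4.5, 3.0, 5.0, 8.0, 7.0, 5.5]


def twenty_game_windows(block_scores, games_per_block=10):
    """Return (first_game, last_game, gandalf_score, x_score) for each
    pair of consecutive blocks, i.e. each 20-game window."""
    windows = []
    for i in range(len(block_scores) - 1):
        first = i * games_per_block + 1
        last = (i + 2) * games_per_block
        g = block_scores[i] + block_scores[i + 1]
        windows.append((first, last, g, 2 * games_per_block - g))
    return windows


windows = twenty_game_windows(gandalf)
for first, last, g, x in windows:
    print(f"Games {first}-{last}: {g:.1f}-{x:.1f}")

# Most and least favourable 20-game windows from Gandalf's point of view.
best = max(windows, key=lambda w: w[2])
worst = min(windows, key=lambda w: w[2])
```

With these block scores the 20-game windows range from 7.5-12.5 (games 41-60) to 15.0-5.0 (games 71-90), although the full match ended 51.0-49.0.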
For all these reasons I think that something concrete about the relative
strength of two programs can only be said once 100, or better 200-300 or even
more, games have been played.
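A rough way to put numbers on this is the usual normal approximation for the error of a match score. The sketch below is only an estimate under stated assumptions: games are treated as independent, and the per-game standard deviation is taken as 0.5, which is the worst case (no draws; draws make the real margin somewhat smaller). The function names and the 1.96 factor for roughly 95% confidence are my choices, not something taken from the match data.

```python
import math


def score_margin(games, z=1.96, sd_per_game=0.5):
    """Approximate half-width of a ~95% interval for the score fraction.

    Assumes independent games and a worst-case per-game standard
    deviation of 0.5 (decisive games only).
    """
    return z * sd_per_game / math.sqrt(games)


def games_needed(margin, z=1.96, sd_per_game=0.5):
    """Games needed so that the half-width is at most `margin`."""
    return math.ceil((z * sd_per_game / margin) ** 2)


for n in (20, 100, 300):
    print(f"{n} games: score known to about +/- {score_margin(n):.1%}")

print("Games for +/- 5%:", games_needed(0.05))
```

Under these assumptions a 100-game match pins the score down to only about 10 percentage points either way, so a 51.0-49.0 result cannot separate the two programs, and a margin of 5 points needs several hundred games.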
Kurt
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.