Author: Dann Corbit
Date: 17:50:49 06/10/02
It will take several hundred games to get a good feel for the relative strength
of two programs.
If the two programs are very close in strength, it will take thousands of games,
at least if you only play them against each other.
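As a rough back-of-the-envelope check on those numbers, here is a small sketch of how many games a head-to-head match needs before a given Elo difference stands clear of the noise. It assumes the usual logistic Elo curve, a normal approximation, a 95% confidence level (z = 1.96), and a draw rate of 29%. The function names are my own, and this is only an estimate, not how any particular rating tool does it.

```python
import math

def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def games_needed(elo_diff, draw_rate=0.29, z=1.96):
    """Approximate games needed before the score margin exceeds z standard errors."""
    score = expected_score(elo_diff)
    # Variance of a single game result (1, 0.5 or 0) for this score and draw rate.
    variance = score * (1.0 - score) - draw_rate / 4.0
    return math.ceil((z * math.sqrt(variance) / (score - 0.5)) ** 2)

for diff in (50, 16, 5):
    print(f"{diff:3d} Elo apart: about {games_needed(diff)} games")
```

The count grows roughly with the inverse square of the Elo difference, so halving the gap you want to detect costs about four times as many games.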
Consider this brief match:
    Program               Elo    +    -  Games   Score  Av.Op.   Draws
  1 Chop1088            : 2458   44   35   231  52.2 %   2442   29.0 %
  2 KnightDreamer v3.0  : 2442   35   44   231  47.8 %   2458   29.0 %
Individual statistics:
(1) Chop1088            : 231 (+ 87,= 67,- 77), 52.2 %
    KnightDreamer v3.0  : 231 (+ 87,= 67,- 77), 52.2 %
(2) KnightDreamer v3.0  : 231 (+ 77,= 67,- 87), 47.8 %
    Chop1088            : 231 (+ 77,= 67,- 87), 47.8 %
With 231 games played, the Elo estimates differ by 16 points, but the error bars
(+44/-35) are far larger than that. While it is more likely that LambChop
(Chop1088) is stronger than KnightDreamer, the data do not prove it.
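To make the error-bar point concrete, here is a minimal sketch that turns the win/draw/loss counts from the match above into an Elo difference with an approximate 95% interval. It assumes the logistic Elo curve and a normal approximation on the match score. This is not necessarily the method behind the table, so the figures will not match it exactly, and the function names are my own.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_elo_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for one side of a head-to-head match."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0), measured from the match itself.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(variance / games)
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))

# Chop1088's result against KnightDreamer v3.0: +87 =67 -77
elo, low, high = match_elo_interval(87, 67, 77)
print(f"Elo difference {elo:+.0f}, 95% interval {low:+.0f} to {high:+.0f}")
```

The interval straddles zero, which is just another way of saying that 231 games cannot separate these two programs.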
Now suppose that we had played so many games that the error bars were smaller
than the Elo difference.
Then what would have been demonstrated is only this:
1. Under the Winboard interface version 4.2.5 using switch /mg
2. On a 950 MHz AMD Athlon with 512 megs ram
3. At game in ten minutes time control
4. With the following custom book for KnightDreamer:
06/07/2002 03:39p 1,018,540 KD3_book.bin
and the following Winboard initialization string:
"KnightD3" /fd="e:\programme\winboard\KnightD3"
5. With the following standard books for LambChop:
02/13/2000 05:03p 1,048,576 CHOPBIGB.BIN
02/12/2001 10:40p 98,304 CHOPBOOK.BIN
and the following Winboard initialization string:
"Chop1088 -xboard -hash 40" /fd="e:\programme\winboard\LambChop"
Under exactly those conditions, LambChop appears to be stronger.