Computer Chess Club Archives



Subject: Program comparison

Author: Shaun Brewer

Date: 07:06:33 03/03/99


I have been experimenting with openings and have therefore played many games,
attempting to determine whether a certain book is better than another. As my PC
is needed for other tasks, I have to interrupt the games and start again; I then
amalgamate the results of several batches of games in an attempt to get
something statistically relevant.

Here are the example scores for one such set of batches, all played on the same
machine using the same program, with books a and b constant across all batches.

   a   -    b
 26    -  35
  9.5  -   6.5
  7    -  15
 58.5  -  54.5
 39.5  -  45.5

I am rapidly coming to the conclusion that hundreds of games would be required
to be able to state that a is better than b, and this would also apply to
program v program tests.
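As a rough sketch of the arithmetic behind that conclusion (an assumption on my part about how the scores were tallied: wins as 1, draws as 0.5 for each side, and a simple normal approximation that treats a's score fraction as a proportion):

```python
import math

# Batch scores for book a vs book b from the post.
batches = [(26, 35), (9.5, 6.5), (7, 15), (58.5, 54.5), (39.5, 45.5)]

score_a = sum(a for a, b in batches)   # 140.5
score_b = sum(b for a, b in batches)   # 156.5
n = score_a + score_b                  # 297 games in total

# How far does a's score fraction sit from the 50% we would
# expect if the two books were equally good?
p_hat = score_a / n
se = math.sqrt(0.25 / n)     # standard error under the 50/50 hypothesis
z = (p_hat - 0.5) / se

print(f"a scored {score_a} of {n} games ({p_hat:.1%}), z = {z:.2f}")
# |z| is well under 1.96, so even 297 games show no significant
# difference at the conventional 95% level.

# Roughly how many games are needed to resolve a 55%-vs-45% edge?
margin = 0.05
n_needed = (1.96 / margin) ** 2 * 0.25
print(f"about {n_needed:.0f} games for a 5% edge at 95% confidence")
```

The approximation overstates the variance a little when draws are common, but the order of magnitude (several hundred games to resolve a 5% edge) stands either way.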

It would also be very easy to stop a series of games at a point that backs a
particular argument.

What level of confidence can be attached to a computer tournament's result, i.e.
that the winner really is the best program?

Is it true that computer v computer results vary more than human v human
results?





Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.