Computer Chess Club Archives




Subject: Re: Value of playing different versions of a program against each other

Author: Ulrich Tuerke

Date: 02:25:09 01/07/03


On January 06, 2003 at 16:56:35, Tom King wrote:

>Hi all,
>What do people think about playing different versions of your program against
>each other as a way of testing?
>I'm playing around with it right now, between v0.07 and a newer version of my
>program. The newer version is winning handsomely: +24,=18,-10.
>This implies a reasonably impressive increase in strength, almost 100 ELO. Ok,
>ok, it's a small sample, so the margin of error could be big.
>However, my gut feel is that playing different versions of your programs tends
>to overstate the strength differences. What do people think?

Hi Tom,

my "improvements" usually look much more modest. (Hopefully, they aren't
an illusion. :-) )
IMHO, it's very difficult to evaluate the result of a program change.
I think that matches against previous versions follow their "own rules".
Nevertheless, I think that they must be done in order to rule out major flaws.
I don't think that differences will be exaggerated this way.

However, I have observed that you have to play really LOTS of games in order to
find out which version is doing better. It seems I'm not as lucky as you are,
because often the weaker version leads after the first 20 games and
subsequently - for me often quite unexpectedly, just before I'm about to throw
in the towel - the stronger version pulls ahead.
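
To put a rough number on "lots of games", here is a minimal sketch (not taken
from any particular testing tool) that turns a win/draw/loss count into an Elo
difference and an approximate 95% interval. It assumes the usual logistic Elo
model and a normal approximation for the score; the function names are made up
for this example.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction under the logistic model."""
    # Clamp so that a 0% or 100% score does not blow up the logarithm.
    s = min(max(score, 1e-9), 1.0 - 1e-9)
    return -400.0 * math.log10(1.0 / s - 1.0)

def match_summary(wins, draws, losses, z=1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    Assumption: games are independent and the mean score is roughly
    normally distributed, which is only an approximation for short matches.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points per game).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    return (elo_diff(score),
            elo_diff(score - z * stderr),
            elo_diff(score + z * stderr))

if __name__ == "__main__":
    elo, low, high = match_summary(24, 18, 10)
    print(f"Elo {elo:+.0f}, approx. 95% interval {low:+.0f} .. {high:+.0f}")
```

For Tom's +24,=18,-10 this gives about +96 Elo, but the interval runs from
roughly +20 to +180, so 52 games really don't pin the difference down.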

I personally like to use the Fritz GUI to test different versions. Both
get the same small book. The GUI is clever enough to replay every game with
the colors reversed, so randomness from the book is more or less minimized.
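
The same idea is easy to set up by hand if your GUI doesn't do it. This is
only an illustration of the pairing scheme, not how the Fritz GUI implements
it; the function name, the engine labels and the opening strings are
placeholders.

```python
def paired_schedule(openings, engine_a="new", engine_b="old"):
    """Build a game list where every opening is played twice, colors reversed.

    Each opening line appears once with engine_a as White and once with
    engine_b as White, so neither version benefits from a lucky book line.
    """
    games = []
    for opening in openings:
        games.append({"opening": opening, "white": engine_a, "black": engine_b})
        games.append({"opening": opening, "white": engine_b, "black": engine_a})
    return games

if __name__ == "__main__":
    for game in paired_schedule(["1.e4 e5", "1.d4 d5"]):
        print(game["opening"], "|", game["white"], "vs", game["black"])
```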



Last modified: Thu, 07 Jul 11 08:48:38 -0700
