Author: Bruce Moreland
Date: 11:49:49 01/13/98
On January 13, 1998 at 14:08:00, Dan Homan wrote:

>Great suggestion Bruce, thanks.
>
>The hardest part is evaluating the changes I make, and these tools you
>suggest sound like a very good idea. My technique for evaluating
>changes up to this point has been either a) run WAC by hand noting any
>obvious changes in solution times/scores, but this is difficult to
>do precisely or b) let the program play on FICS for a while and
>observe the games. Any technique is flawed but yours sound like it
>at least provides quantitative (and diverse) information on which to
>make a decision.

It is a good idea to automate your test suite runs.

My program is passed the test suite name and time information via the
command line, and when it finishes a suite it quits, which lets me run
the program several times via a batch file. I number the output files
according to which version created the file, and I number the
executables too, so I can easily figure out what I've already done, and
I can conveniently run something new, even with a very old version.

I'm just making use of a few semi-truths:

1) If two programs search the same tree and produce the same result,
the one that does it faster is stronger.

2) If you are trying for a strength increase via point #1, but you
notice that the tree has changed as a result, it is likely that a bug
is involved.

3) If you make a change to extensions or pruning, it's worth checking
against a large and diverse tactical suite, as well as checking to see
what effect your change has on tree size in positional cases.

There are cases that are hard to test. If I increase my doubled pawn
penalty, I'll run suites in order to establish a new baseline for later
comparison, but it doesn't really matter whether it solves a few more
or fewer on a big tactical suite, or whether the program gets to depth
D a little faster or slower. A change like this isn't going to have
that kind of impact legitimately, so what I'm seeing is probably just
noise. If there is a *huge* impact from a simple change, though, it's
worth investigating, since you may have completely wrecked something.

I don't know how to verify small changes. I use the qualitative method,
like you do (watching games on a server), but this is flawed too. You
can get a mistaken impression easily, I think. And I think it's almost
useless to watch for rating-point changes when you make a small (or
even large) change, since the ratings on the chess servers are pretty
random. I've seen more than a 200-point variance with the same version
if you leave it on for a few days, and it is hard to imagine that a
change in a chess program, unless it introduced or fixed a really
massive bug, could be detected against that.

bruce
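PS: In case a skeleton helps, here is roughly the shape of the harness
I'm describing, sketched in C. The argument layout, the version-numbered
log name, and the run_suite() stub are just illustrations, not my real
code; the actual search loop goes where the stub is.

/*
 * Minimal sketch of a batch-driven test harness: suite name and time
 * come in on the command line, results go to a version-numbered log,
 * and the program quits when the suite is done.
 */
#include <stdio.h>
#include <stdlib.h>

#define VERSION 17   /* bump for each new executable */

/* Stub standing in for the real search loop: a real engine would
   search every position in the suite for the given number of seconds
   and log moves, scores, and times to 'out'. */
static void run_suite(const char *suite_file, int seconds, FILE *out)
{
    fprintf(out, "suite=%s  seconds=%d\n", suite_file, seconds);
}

int main(int argc, char **argv)
{
    char out_name[64];
    FILE *out;

    if (argc != 3) {
        fprintf(stderr, "usage: engine <suite.epd> <seconds-per-position>\n");
        return 1;
    }

    /* Number the output file after the engine version, so results
       from old versions stay around for later comparison. */
    sprintf(out_name, "wac_v%03d.log", VERSION);
    out = fopen(out_name, "w");
    if (out == NULL) {
        perror(out_name);
        return 1;
    }

    run_suite(argv[1], atoi(argv[2]), out);
    fclose(out);

    /* Quit when the suite is done, so a batch file can start the
       next run or the next version. */
    return 0;
}

With that in place, the batch file is just one line per run, something
like "engine17 wac.epd 5", and you can queue up as many versions and
suites as you want overnight.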