Computer Chess Club Archives

Subject: Re: Test position software... whats out there already?

Author: Dave Gomboc
Date: 22:45:33 04/27/04

On April 27, 2004 at 18:06:17, Dann Corbit wrote:

>On April 27, 2004 at 17:57:15, Eric Oldre wrote:
>
>>My engine (murderhole) has gotten to the point where it's not easy for me to
>>tell whether a given change has helped or hurt. So I need to come up with a
>>better, more quantitative way of testing.
>>
>>Of course I know that the only sure test is lots and lots of games, but I just
>>don't have the patience, and the results can vary so much.
>>
>>My idea, and I'm sure you all have thought of something similar, but probably
>>better (that's why I'm posting), is:
>>
>>1) Create a series of test positions, probably small at first (50-100), though
>>it would need to grow later.
>>
>>2) Generate a list of all possible moves from each test position and the
>>resulting position after each move.
>>
>>3) Have some strong program generate scores for each resulting position, and
>>therefore a score for the preceding move.
>>
>>4) Then I could run my program against each position and see how often it picked
>>the best, 2nd best, 3rd best, etc.
>>
>>I could store all the test positions and scores in an XML file perhaps.
>>
>>The only problem is that setting this up would be a pain and spare time is not
>>something I have lots of these days (like all of us, I'm sure), so if I can avoid
>>some work I'm all for it. I was hoping someone might have some files for
>>configuring this stuff publicly available, or maybe it is even a feature of some
>>commercial program that I don't know of.
>>
>>Even if only pieces of this are out there it could help. Or something similar.
>>
>>Any ideas?
>
>Test suites do not work for this purpose.  They are good for judging tactical
>strength, but poor estimators of game strength.  If you optimize for tactics,
>then the program will play poorly.  I can generate a large boost in the tactical
>strength of Beowulf by tuning using test suites.  Then it gets murdered in
>actual games.
>
>It may be that quiet moves could be a good indicator.

Yes, I agree that typical test sets fail to provide an appropriate balance of
positions from which one can work on improving their program for game
conditions.  Disclaimer: I am not a chess program author.
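
For anyone who wants to try the ranking scheme Eric outlines in steps 1-4 anyway, here is a minimal Python sketch of the bookkeeping only. Everything in it is an assumption for illustration: the position ids, the move strings, the centipawn scores, and the function names are hypothetical, and the reference scores are assumed to have been produced beforehand by some strong program. It counts how often the engine under test picked the best, 2nd best, 3rd best move, and also reports the average score given up relative to the reference best move.

```python
from collections import Counter


def move_rank(scores, chosen):
    """Return the 1-based rank of `chosen` among the reference scores.

    Moves with equal scores share a rank, so two equally good best
    moves both count as rank 1.
    """
    target = scores[chosen]
    return 1 + sum(1 for s in scores.values() if s > target)


def summarize(suite, picks):
    """Tally ranks of the engine's picks and the average score loss.

    suite: {position_id: {move: reference_score}}  (hypothetical layout)
    picks: {position_id: move chosen by the engine under test}
    """
    ranks = Counter()
    total_loss = 0
    for pos_id, scores in suite.items():
        chosen = picks[pos_id]
        ranks[move_rank(scores, chosen)] += 1
        # Loss is measured against the best reference score in this position.
        total_loss += max(scores.values()) - scores[chosen]
    return ranks, total_loss / len(suite)


# Made-up data: scores in centipawns from the side to move's point of view.
suite = {
    "pos1": {"e2e4": 30, "d2d4": 30, "g1f3": 20, "a2a3": -10},
    "pos2": {"g1f3": 15, "h2h4": -40, "c2c4": 25},
}
picks = {"pos1": "g1f3", "pos2": "c2c4"}

ranks, avg_loss = summarize(suite, picks)
print(dict(ranks), avg_loss)
```

Tracking the average loss alongside the rank counts is a design choice of this sketch: a 2nd-best move that is 5 centipawns worse and one that drops a piece both show up as "rank 2" otherwise. Dann's caveat above still applies to whatever positions go into the suite.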

>Another possibility is to look at what Dave Gomboc did in his thesis.
>Seems like it would require lots of hardware, though.

Mmm, well, lots of hardware helps :-) but also a big help is to be in full
control of the software you're testing.  Theoretically I could change Crafty as
much as I liked, but in practice my experiments wouldn't have been valid if I
broke something in it accidentally, and besides, if there was extensive change
then Crafty would no longer be Crafty.  I was very careful about what I changed,
and almost always ended up doing things in slow ways using a relatively small
interface to Crafty's functionality that I was sure would work as opposed to
performing invasive code surgery, because my purpose was researching a novel
technique, not creating a highly efficient implementation.  There were even
orders-of-magnitude speed-ups I didn't implement but described in the future
work section.

I'm sure that someone who knows their chess program well could get much better
performance using my tuning method than I did with Crafty.  Nonetheless, even if
people don't adopt the technique I propose, if they read my thesis and come away
with improved understanding, I'm happy.

The thesis URL has changed (I glued the front and the main part together :-) but
it's still available from http://www.cs.ualberta.ca/~dave.

Dave