Computer Chess Club Archives


Subject: Re: How to use a test suite to evaluate program strength?

Author: Dann Corbit

Date: 22:51:42 10/31/98


On October 31, 1998 at 20:56:37, Robert Pawlak wrote:
>Thanks for your reply Manfred...
>On October 31, 1998 at 10:57:30, Manfred Rosenboom wrote:
>>I don't think, that there is any relationship between ANY test and
>>ELO or USCF rating. You will only get some kind of trend, how a
>>certain program handles certain kind of positions, that's all.
>
>This was not quite what I wanted to hear... I know there is some controversy
>over what test suite is 'best', but I thought I could at least get a rough
>ballpark estimate...
>
>How about this - in general, assuming I am running something like Genius 3 or
>Fritz 3 on this machine (386/20, 4 M RAM), what kind of performance can I
>expect??? 2300 ELO?
You can't make a direct comparison.  In fact, the very best problem-solving
programs can't even play a game of chess.  They just solve mates.  I am
beginning a project that will solve every problem set it has already seen,
because it will use a database.  When it sees a position with a 'bm' (best
move) entry, it will remember the solution and solve that position instantly
the next time.  But when it meets an unfamiliar position, it will probably not
be all that great.
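
To make the database idea concrete, here is a minimal sketch in Python.  It is
not the actual project code; the names parse_epd and SolutionBook are made up
for illustration, and the sample record is invented.  It assumes the position
key is the first four EPD fields and that operands contain no ';' inside
quotes.

```python
def parse_epd(line):
    """Split one EPD record into (position_key, {opcode: operand})."""
    fields = line.strip().split(None, 4)
    if len(fields) < 4:
        raise ValueError("an EPD record needs four position fields")
    # Key = board, side to move, castling rights, en passant square.
    key = " ".join(fields[:4])
    ops = {}
    # Simplification (assumption): no ';' appears inside a quoted operand.
    for op in (fields[4] if len(fields) > 4 else "").split(";"):
        parts = op.strip().split(None, 1)
        if parts:
            ops[parts[0]] = parts[1] if len(parts) > 1 else ""
    return key, ops


class SolutionBook:
    """Hypothetical in-memory store of positions whose 'bm' has been seen."""

    def __init__(self):
        self._best = {}

    def learn(self, line):
        """Remember the 'bm' moves of a record; return False if it has none."""
        key, ops = parse_epd(line)
        if "bm" not in ops:
            return False
        self._best[key] = ops["bm"].split()
        return True

    def lookup(self, line):
        """Return the remembered best moves, or None for an unseen position."""
        key, _ = parse_epd(line)
        return self._best.get(key)


# Invented sample record, for illustration only.
RECORD = ('r1bqkb1r/pppp1ppp/2n2n2/4p2Q/2B1P3/8/PPPP1PPP/RNB1K1NR '
          'w KQkq - bm Qxf7#; id "demo.001";')

book = SolutionBook()
book.learn(RECORD)
print(book.lookup(RECORD))
```

A lookup like this answers instantly for anything it has stored and returns
nothing at all for a new position, which is exactly why a perfect score on a
known suite says little about play.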

Some kinds of problem positions will definitely fool the best chess playing
programs.  For instance, programs that use null move pruning will be a lot
better overall than those that don't.  But this very technique will make them
blind to some unusual problem positions (zugzwang positions are the classic
case).  So being able to solve a large problem set may actually be an
indication that a program is *inferior*, depending upon how the problems were
actually solved.

The best chess programs are already stronger than most people.  So the
average Joe on the street can't beat them anyway.  Only if you happen to be a
GM or have a good friend who is one will the playing strength be a big issue.

Instead, why not get a chess program for what it does best?  The "pirates coming
over the wall" attacks of CS Tal can be fun.  The overall solid play of Rebel or
Fritz is a joy to watch.  You can get the guts of a program like Phalanx or
Crafty and trace along or study the algorithms and find out exactly how it
*works*.  So why not go to a chess store and try them out?  Read a few reviews.
Ask in a chess newsgroup about topics you are interested in.  Get the program
that does what you find most interesting.  Or get several.  Chess software is
unbelievably cheap.  Most top programs are less than $100.

I like to trace through games played by chess programs with Winboard.  Once in a
while you will see them pull a real stunner.  Get this if you like:
ftp://38.168.214.175/pub/BENCH2.ZIP
It contains over 6000 test positions in EPD format.  Try it with your favorite
program and see how many it solves at different time controls.  You might even
like to estimate what time control would be necessary to solve them all.



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.