Computer Chess Club Archives



Subject: Re: Ultimate Use of Suites of Test Positions???

Author: Mike S.

Date: 13:59:55 12/07/02



On December 07, 2002 at 16:02:55, Bob Durrett wrote:

>On December 07, 2002 at 14:32:53, Robin Smith wrote:
>>(...) So what purpose
>>is served by a set of test positions?  The only advantage I can see for such a
>>test is to get quicker results for programs not tested yet by SSDF.  But the
>>results will also always be highly suspect.

>It would, IMHO, have very little value for testing the absolute top engines, and
>would not serve to identify "the very best" engine.  This is because the top
>engines are too close to each other in strength.  But it would/should be VERY
>useful for people in the process of developing new chess-playing programs (...)

Another (important, IMO) purpose is to test the *analytical skills* of an
engine, rather than to determine which is best in standard gameplay. According
to several polls, analysis is the main way chess programs are used.

Early test suites, back in the days of dedicated board chess computers, were
intended to test general strength (or at least it was hoped they could; they
were never 100% reliable anyway). But as engines became stronger, people
realised two things about tests and their purpose:

1. engines are analysis tools (too), and that became the most important use
2. the situation an engine faces during a test is similar to analysis

(Of course I mean "manual interactive" analysis, where engines are used on
interesting positions only, not automatic analysis functions.)

So it is obvious that test suite results can be representative of analysis
skills, much more so than of overall gameplay strength. There are gameplay
factors like time usage, style, effectiveness of certain "technical" things
like hash learning, etc. which can't be tested by traditional methods, only
by actual engine matches.

Something that has been done all along is topic-specific test suites, covering
endgames, underpromotion and the like. A correspondence player who wants to
decide which engine to choose for which type of position will want to see
detailed results grouped by these topics (i.e. opening/middlegame/endgame, or
combos (attack/defense)/endgame knowledge/positional play, etc.). For an overall
result, he would have to check whether the test suite(s) have a mixture of
position types that he thinks is representative of common practice. I think
what the best mixture is comes down to individual choice.
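
To illustrate the grouping idea, here is a minimal Python sketch. It is not
taken from any real test suite: the topic names, the solved/unsolved results
and the weights are all made up, and the helper names solved_rate_by_topic and
weighted_score are hypothetical. It only shows how per-topic results could be
combined with a personal mixture of weights.

```python
from collections import defaultdict


def solved_rate_by_topic(results):
    """results: iterable of (topic, solved) pairs -> {topic: fraction solved}."""
    solved = defaultdict(int)
    total = defaultdict(int)
    for topic, ok in results:
        total[topic] += 1
        if ok:
            solved[topic] += 1
    return {topic: solved[topic] / total[topic] for topic in total}


def weighted_score(rates, weights):
    """Weighted mean of per-topic rates, using the user's own topic weights."""
    used = {t: w for t, w in weights.items() if t in rates and w > 0}
    if not used:
        raise ValueError("no tested topic has a positive weight")
    return sum(rates[t] * w for t, w in used.items()) / sum(used.values())


# Made-up results for one engine: (topic, position solved?)
results = [
    ("endgame", True), ("endgame", False), ("endgame", True), ("endgame", True),
    ("attack", True), ("attack", False),
    ("positional", False), ("positional", True),
    ("positional", True), ("positional", True),
]

rates = solved_rate_by_topic(results)

# Hypothetical personal mixture: this user cares twice as much about endgames.
weights = {"endgame": 2, "attack": 1, "positional": 1}
overall = weighted_score(rates, weights)

for topic, rate in rates.items():
    print(f"{topic:12s} {rate:.2f}")
print(f"{'overall':12s} {overall:.4f}")
```

Changing the weights changes the overall ranking without touching the
per-topic figures, which is why the detailed table is the more useful output.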

But I guess very strong players who use engines for "serious", important
analysis, e.g. opening novelties for a GM tournament, will also choose engines
by studying their games and/or by playing against them themselves. Test suites
have additional appeal when the fan (like me) isn't strong enough for these
methods. They are a way to watch strong engine performances and discover
differences between them even when one's own strength is hundreds of Elo
points lower :o)

IOW, to some extent computer chess testing is an end in itself.

Regards,
M.Scheidl




Last modified: Thu, 15 Apr 21 08:11:13 -0700
