Author: Jay Scott
Date: 16:29:42 02/06/98
To create a test suite we need a bunch of positions that we're pretty sure we understand. One day I realized there's an abundant source: opening books. I can think of lots of kinds of test suites that can be made from opening positions:

(1) There are many positions where a set of best moves is known. For example, in the initial position the best moves are considered to be 1. e4, 1. d4, 1. c4 and 1. Nf3. It's not obvious that 1. Nc3 is worse (and some programs will play 1. Nc3 with opening book turned off). Of course, you may want to include only positions with a single best move, but that's not necessary.

(2) There are many unbalanced positions which are thought to be dynamically equal. A test suite could include positions like the King's Gambit (1. e4 e5 2. f4) or the Ruy Lopez Exchange Variation (1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Bxc6), where the sides accept different kinds of advantage, and score a program as "successful" if it evaluates the positions as approximately equal. Of course that means the programs have to return genuine scores, and it'll be hard to get consistent scoring between programs. But this could be a handy way to evaluate changes to a single program.

(3) The problem of comparing scores can be solved by creating a test suite of pairs of positions. After 1. e4 e5, the move 2. f4 is at least as good as 2. d4, so we could feed both positions to a program and score it as "successful" if it evaluates the position after 2. f4 at least as highly. The program still has to return genuine scores. Or we could pair any position known to be bad with any position considered to be equal, and so on.

(4) A test suite could be constructed automatically from a game database: if people reach a position frequently and the results are about even, we can guess that it's an even position and accept the best moves from the database. Uneven positions may be interesting too.
This kind of automatically-generated test suite may not be as reliable as a hand-made one, but it's easy to create. Unlike automatically-generated opening books, we don't have to include every position, and we can narrow the suite down to positions that have convincing statistics. For programs that have played enough games, like Crafty, the game database could be restricted to the program's own games to guarantee relevance--here the test might be to see whether the program's evaluation reflects the outcome statistics of the position.

A test suite made from opening positions would naturally cover both tactical and positional considerations, unlike most test suites. Because the positions have been deeply investigated by many people, there won't be as many disagreements and errors (but you'll never get away from errors altogether).

Disadvantages of test suites made from opening books:

- There won't be many endgame positions. :-)
- Programs can already play the positions well, with their opening books. So arguably the test positions are all irrelevant.
- Human openings are made for humans. Maybe the best moves for a human to play aren't the best for a program.
- To create one by hand, you'll need an opening expert.

Jay