Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Building an opening book

Author: Stuart Cracraft

Date: 12:33:48 04/07/98

Go up one level in this thread


What I did was take all the main lines of ECO and then simply
use the standard Zobrist hashing that's used for the program's
in-memory transposition table to manage a simple file-based
binary book. It is very simple and works for smaller opening
books like this one relatively quickly. It would not scale
to large opening books but I haven't tried. Simply sorting
the binary book and using a binary search would
permit it to scale to much larger though.

First, use the Zobrist method to calculate a unique key for the
position.
Store part of the key with the position in a struct along with the
the other part of the key, e.g. the same way you do your hash tables.

Now read in all the lines/games that will make up your opening
book and each move, take the hash key, store it in a array of struct
after checking the array to ensure it is not already in there. Use
a good-sized array.

When the array fills up with unique hash keys, write it out with
write().
When you've processed all input files (lines, games, etc.), you have a
choice to sort the resulting binary file by key or leave it unsorted.
If
the file is very large, you may not have enough main memory to sort it.
File-based sorting can be studied in Knuth's books to sort these really
large books.

For small books (mine is about 80,000 positions unsorted), I can
do a simple linear search using read() to read the file on disk into
memory one time, check for any matches of the positions reachable
from the current position comparing the hash keys as well as the
secondary hash key, and putting them into the "opening book
moves list" and then use a simple algorithm to pick amongst them
to suit user operation before outputting your book move.

I can read my entire bucket into main memory in one fell swoop. If
no book moves are available, I can free up the memory for my regular
hash tables (position hash and pawn hash) and do a normal search. If
I get several such failures several moves in a row when it is the
computer's time to move, I disable the book and don't use it, otherwise,
I check if it is already in memory and avoid the disk read.

If you get ambitious, do a file-based sort and use binary search
to search the disk file, if you can't read it fully into memory or
implement a hash algorithm like the dbm() ones on Unix for your opening
book management, for a really large opening book capability. I didn't
do this for the PC since it seemed like too much trouble and my book
wasn't (yet) very large.

As for whether to include the move in the book (really the position's
hash key after the move since you want a position-based book, not
a move-based book, so that transpositions will handle seamlessly),
suggest you include only the moves for the winning side.

If you get ambitious for that, minimax your book. Results reported by
Ken Thompson for Belle have not really been encouraging enough to do
that amount of work since the return to him/Belle was not signficiant.

Also, I have a book compile option that scores each book position as it
is stored in the in-memory array, prior to disk-save, ith a short 1 or 2
ply
search with quiescence and saves the score in the book along with the
position's hash key. The program then has options/modes to "pick the
best
scored moved", "pick randomly among the top N scores", "pick randomly
among
the last N moves -- in an ECO organized book, this would produce very
strange book moves from ECO." So the book can have variety but pick
positions that its evaluation function generally likes. Not a full
minimax, but enough to avoid
weird gambit positions that lose material or make the program's
evaluation function struggle for quite a few moves to even a perceived
positional disadvantage.

You can avoid book collisions with a simple extension of the above since
your
position space obviously maps to a much smaller hashkey space.

The above algorithm has been entirely adequate for my book which is
simply
the ECO main lines, some supporting lines, and some master games. It is
enough to save enough time in the opening that the middle-game can use
more time. I use Hyatt's algorithm of providing more time for the first
10 moves after the last book move scaling down, a curve that looks like
the GM's use of time in the game, as described in the ICCA Journal some
years back.

--Stuart

On April 06, 1998 at 20:47:22, Brian McKinley wrote:

>I am working on a losers chess program and it plays very poorly in the
>opening.  The program does not currently support an opening book.  I
>would like to collect games of strong players on ICC and build an
>opening book based on these games, but I am not sure how to go about it.
>
>I have seen references here and one ICC that lead me to believe that the
>most common method for creating an opening book is from a game database.
>
>Are there any utilities available to help me in this endeavor?  If so
>what format do they require the games data to be in, and how do I get to
>that format from pgn?  Is the format of the opening book output
>documented or is there example code to read it?
>
>If there are no utilities available, is there any other criteria I
>should consider besides win vs. loss when determining whether to include
>the opening moves in the book?
>
>Thanks in advance
>
>Brian



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.