Computer Chess Club Archives



Subject: Re: Compiling Crafty's enormous PGN into an opening book

Author: Steven Edwards

Date: 13:11:51 09/25/03



On September 25, 2003 at 11:27:45, Robert Hyatt wrote:
>On September 24, 2003 at 20:27:02, Steven Edwards wrote:
>>On September 24, 2003 at 19:54:35, Robert Hyatt wrote:
>>>On September 24, 2003 at 18:29:05, Steven Edwards wrote:

>>The resulting opening book requires O(log2(sqrt(N))) probes to fetch an entry.
>>This is fast enough to use the book at the upper levels of the search and other
>>selected interior nodes.

>That seems pretty slow to use in the search.  IE egtb probes are O(1).  My
>opening book is also O(1) for any single node, as I "group" all moves from
>a position together in one block.

Yes, storing moves grouped together by parent position is what I did in my old
program Spector.  This time around I decided to simplify the book somewhat by
using fixed-length entries.  This also simplifies the color reversal scheme:
all data is stored WTM, and a book entry is potentially doubled for BTM when
either player has somehow lost a tempo in the opening.  The book layout also
makes it very easy (and O(Npos) fast) to produce new books by merging any
number of previous books simultaneously, without any sorting.
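Since every input book is already key-sorted, the no-sort merge is essentially
a standard k-way merge.  A rough sketch of the idea (the entry layout and the
duplicate-combining rule here are my own illustrative assumptions, not the
toolkit's actual format):

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical fixed-length book entry: a WTM position key plus a weight.
struct Entry {
    uint64_t key;    // position hash; each book is sorted ascending by key
    uint32_t weight; // assumed per-entry statistic, summed when keys collide
};

// Merge any number of key-sorted books in one linear pass over the total
// number of entries.  No sort step is needed because every input is already
// ordered; a small heap picks the next-smallest key among all inputs.
std::vector<Entry> mergeBooks(const std::vector<std::vector<Entry>>& books) {
    using Cursor = std::pair<uint64_t, size_t>; // (current key, book index)
    std::priority_queue<Cursor, std::vector<Cursor>,
                        std::greater<Cursor>> heap;
    std::vector<size_t> pos(books.size(), 0);

    for (size_t i = 0; i < books.size(); ++i)
        if (!books[i].empty()) heap.push({books[i][0].key, i});

    std::vector<Entry> out;
    while (!heap.empty()) {
        auto [key, i] = heap.top();
        heap.pop();
        const Entry& e = books[i][pos[i]];
        if (!out.empty() && out.back().key == e.key)
            out.back().weight += e.weight;  // combine duplicate positions
        else
            out.push_back(e);
        if (++pos[i] < books[i].size())
            heap.push({books[i][pos[i]].key, i});
    }
    return out;
}
```

With k input books and Npos total entries this is O(Npos log k), which for a
handful of books is effectively the O(Npos) single pass described above.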

Incidentally, this is the only place in the toolkit where data is accessed,
scored, or represented as WTM.  Everywhere else the data is based on the active
(on the move) color.  I think this makes the code a bit cleaner and removes the
WTM conditionals that would otherwise pepper the source.
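Storing everything WTM usually means canonicalizing black-to-move positions
by mirroring the board vertically and swapping the color sets before hashing.
A minimal bitboard illustration of that idea (the toolkit's actual position
representation is not shown in this thread, so this is purely a sketch):

```cpp
#include <cstdint>

// Mirror a bitboard vertically (rank 1 <-> rank 8).  With one byte per rank
// this is a plain byte reversal, done here with portable shift/mask steps.
uint64_t flipVertical(uint64_t b) {
    b = ((b >>  8) & 0x00FF00FF00FF00FFULL) |
        ((b & 0x00FF00FF00FF00FFULL) <<  8);
    b = ((b >> 16) & 0x0000FFFF0000FFFFULL) |
        ((b & 0x0000FFFF0000FFFFULL) << 16);
    b = ( b >> 32) | (b << 32);
    return b;
}

// Hypothetical minimal position: just per-color occupancy and side to move.
struct Pos {
    uint64_t white, black;
    bool wtm;
};

// Canonicalize to white-to-move: if Black is on the move, mirror the board
// and swap the color sets, so equivalent positions produce identical keys.
Pos toWTM(Pos p) {
    if (!p.wtm)
        p = {flipVertical(p.black), flipVertical(p.white), true};
    return p;
}
```

Everywhere outside the book, the data would then stay in active-color form
and this transform is applied only at the book boundary.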

>A book with 1M entries will need 10 probes roughly, for your complexity as
>given.  IE sqrt(1M) = 1K.  Log2(1K)=10.  I don't want to do 10 I/Os that
>will take about 5ms each on a 15K SCSI drive, at any point in the tree.

It turns out that the total I/O difference is smaller than that, and may often
be in favor of the toolkit.  First, all the subnodes of a position are usually
needed only at the root, so the time delta there is negligible.  Second, at
interior search nodes, usually only a few (and maybe only one) immediate
descendants of a position need probing, not all of them.

On another note, I did some more tests with the book compiler, and it turns
out that nearly all of the time is spent in the run-time support for the
<strstream>/<sstream> package.  This is okay with me, as book compilation is
done rarely and never under competitive time controls.  The toolkit source is
about 1 MB of C++, and in all of that code there is only one fixed-length
character array: the SAN encoder buffer.  Everything else uses ostrstream and
istrstream objects, so: 1) I never have to worry about buffer overflow
checking, 2) I don't have to figure out and hard-code the maximum length of
any particular string variable, 3) the toolkit never allocates more storage
than needed, and 4) the toolkit can handle arbitrarily long input without
problems.
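That trade — dynamic stream buffers instead of fixed char arrays — looks
roughly like this (using the modern <sstream> names; <strstream> is the older,
since-deprecated equivalent, and the SAN text here is only an example):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Format a sequence of SAN moves as numbered movetext.  The ostringstream
// buffer grows on demand: no fixed-size char array, no overflow checks, and
// no guessed maximum line length.
std::string movetext(const std::vector<std::string>& sans) {
    std::ostringstream os;
    for (size_t i = 0; i < sans.size(); ++i) {
        if (i) os << ' ';
        if (i % 2 == 0) os << (i / 2 + 1) << ". ";  // move number before White
        os << sans[i];
    }
    return os.str();
}

// The inverse direction: istringstream tokenizes input of arbitrary length.
std::vector<std::string> tokens(const std::string& text) {
    std::istringstream is(text);
    std::vector<std::string> out;
    for (std::string t; is >> t; ) out.push_back(t);
    return out;
}
```

The cost is exactly what the profile shows: stream formatting does more work
per character than raw array writes, which is a fine price for an offline
book compiler.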



