Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Binary Book Creation

Author: Robert Hyatt

Date: 06:15:25 06/21/00

Go up one level in this thread


On June 20, 2000 at 22:24:32, Adrien Regimbald wrote:

>Hello,
>
>>Here is a quick explanation.  On the front of the book, I have 32,768 indices
>>to "clusters" in the book.  A "cluster" is any book positions that have the
>>upper 15 bits identical (hence the cluster/index concept).  Once you know the
>>hash signature you want to look for, you take the upper 15 bits, index into
>>the cluster indices, and that gives you a point in the book.bin file that you
>>fseek() to.
>>
>>Now the cute part.  My hash signatures in the book are created in a different
>>way than the usual hash signatures.  The upper 16 bits of the book hash comes
>>from the _parent_ position...  The right 48 bits comes from the _child_
>>position.
>>
>>Why?
>>
>>This puts all book positions that have a _common_ parent in the same cluster
>>(yes, along with some others that have the same common parent signature but
>>are not relevant).  I now fseek to the cluster, do one I/O, and I get _all_
>>possible book positions from this parent position, rather than randomly doing
>>I/O all over the book.
>>
>>Works great for very fast games.  I build the book like this:
>>
>>Parse every move and emit a normal hash signature as above (16 bits for parent
>>position, 48 bits for new child position).  I then sort this (I sort it in
>>chunks to avoid needing gigabytes of memory.)  Now I fseek() beyond the first
>>32K indices (128kb) and start reading the sorted chunks and merging them.  At
>>any time the upper 15 bits change, I update the last cluster start, write out
>>the completed cluster (first 4 bytes of a cluster is number of moves in the
>>cluster).  I continue until all the sort files are merged into the book file,
>>then I back up and write all the cluster indices I have been storing in an
>>array.
>>
>>Quick, easy, and _very_ fast to access.  And it prevents some ugly
>>transpositions due to the hash signature having the parent and child signatures
>>merged.  IE e4 e5 Nf3 a6 Bb5 and black might well play Nc6 which leads to a
>>known book position.  Of course axb6 would be much better.  This scheme at least
>>defeats that kind of stuff, although it also causes a few useful transpostions
>>back into book to be delayed for one move.
>
>
>Hmm.. I never thought of that..  I have one doubt plaguing my mind though:
>It seems that this sort of parent/child setup would be good for lightning/very
>fast games where it is frequent to make blunders of this sort .. but in a
>standard length game against a strong opponent, I have the impression that you
>will not reach that sort of situation, and are much more likely to benefit from
>getting the transposition back into book instead..


The problem is that there are tricks to deal with, and this handles most of
the problems handily, without stepping into other problems.  You certainly don't
want to allow your opponent to do this:

e4 e5 nf3 Nc6 Nf1 Nb8 as you are nearing a draw without thinking at all. .
Crafty wouldn't play Nb8 from book, due to the parent-child signature, so it
would search.  It might still play Nb8 (very doubtful) the first time, as it
knows it is only a 2-fold and not a 3-fold repetition.  But it would have to
find Nb8 by search, assuming it is reasonable.  Then it would play it, and now
it would discover that it is back in book.

>
>
>One question - is the grouping of parent/child positions like that all together
>really beneficial in practice?  I was under the understanding that binary file
>access is pretty much constant-time regardless of the position in the file,
>unless you need to switch cylenders a lot, or the binary file is scattered
>completely - no contigous blocks.  What size of a book would you need to
>actually notice a benefit with this?


It is 100x faster to read in one 1K chunk of data than it is to do 35 fseek()
calls and 35 10-byte chunks.  Just moving the read/write head around is bad
enough as you contend with other things that are also moving the read/write
heads.  This is why disk cache software works so well.




>
>
>Regards,
>Adrien.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.