Computer Chess Club Archives



Subject: Re: SAN problem with Crafty's wall.pgn book

Author: Robert Hyatt

Date: 14:32:53 06/18/98



On June 18, 1998 at 16:50:19, Edward Screven wrote:

>my understanding of the crafty book building procedure is that
>you scan a pgn input file, streaming <position,win,loss,draw>
>records through an aggregating sort, and it's the disk sort
>runs that require lots of space.
>
>if this is correct, then a simple way to reduce your temporary
>space requirements by 1/N, at a cost of making N passes over
>the pgn input, is to partition the position keys into N equal
>sized ranges.  make N passes over the input pgn file.  on the
>i-th pass, discard all positions which are not in the i-th range.
>the independently sorted results of each pass can be appended to
>your final output file.
>
>  - edward


Here's what happens.  I first parse the PGN and output (now) a 9-byte
record for each move: an 8-byte hash signature, plus 1 byte holding the
result of the game (3 bits) and the !/?/etc. flags (5 bits).  These records
are streamed out to a file.
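
As a rough illustration of the 9-byte record described above, here is a
minimal sketch.  The names pack_record, unpack_record and RECORD_SIZE are
hypothetical, and so is the bit placement (result in the top 3 bits, flags
in the low 5); the post only gives the field widths, not the layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RECORD_SIZE 9  /* 8-byte hash signature + 1 status byte */

/* Assumed layout: game result in the top 3 bits of the last byte,
   annotation flags (!, ?, etc.) in the low 5 bits. */
static void pack_record(unsigned char out[RECORD_SIZE], uint64_t signature,
                        unsigned result, unsigned flags)
{
    assert(result < 8 && flags < 32);   /* must fit in 3 and 5 bits */
    memcpy(out, &signature, 8);         /* stored in host byte order */
    out[8] = (unsigned char)((result << 5) | flags);
}

static void unpack_record(const unsigned char in[RECORD_SIZE],
                          uint64_t *signature, unsigned *result,
                          unsigned *flags)
{
    memcpy(signature, in, 8);
    *result = in[8] >> 5;
    *flags  = in[8] & 0x1Fu;
}
```

Because the signature is copied in host byte order, a file written this
way would not be portable between machines of different endianness.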

I then read this back one huge chunk at a time into a memory buffer,
call qsort() to sort it (an in-memory sort, not a disk sort), and save each
sorted chunk in a separate file.  At the moment this step finishes, I have
two copies of everything on disk: the unsorted file and the sorted chunks.
(Note that I now write 9-byte records; I was writing 20-byte records
[linux] or 24-byte records [windows].)
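
The chunk-and-qsort() step might look roughly like the sketch below.  The
helper names (compare_records, sort_one_chunk) and the choice to order
records by the 8-byte signature alone are assumptions made for
illustration, not Crafty's actual code.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RECORD_SIZE 9  /* 8-byte signature + 1 status byte */

/* Order records by signature.  memcpy is used because records in a
   packed buffer are not 8-byte aligned. */
static int compare_records(const void *a, const void *b)
{
    uint64_t ka, kb;
    memcpy(&ka, a, 8);
    memcpy(&kb, b, 8);
    return (ka > kb) - (ka < kb);
}

/* Read up to max_records from `in` into buf, sort them in memory, and
   write the sorted run to `out`.  Returns the number of records read
   (0 at end of input). */
static size_t sort_one_chunk(FILE *in, FILE *out, unsigned char *buf,
                             size_t max_records)
{
    size_t n = fread(buf, RECORD_SIZE, max_records, in);
    qsort(buf, n, RECORD_SIZE, compare_records);
    if (fwrite(buf, RECORD_SIZE, n, out) != n)
        perror("fwrite");
    return n;
}
```

A caller would invoke sort_one_chunk() repeatedly, opening a fresh output
file for each run, until it returns 0.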

I now delete the original unsorted input, then do a simple N-input
merge and write the book out with some indices on the front to give me
quick access.
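
A simple N-input merge over the sorted chunk files could be sketched as
follows.  This is only an illustration under assumptions: merge_runs and
record_key are hypothetical names, the smallest head record is found by a
linear scan, and the index-building and any combining of records with
equal signatures are left out.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RECORD_SIZE 9  /* 8-byte signature + 1 status byte */

static uint64_t record_key(const unsigned char *rec)
{
    uint64_t k;
    memcpy(&k, rec, 8);
    return k;
}

/* Merge nruns sorted run files into `out`.  Returns the number of
   records written, or (size_t)-1 if memory allocation fails. */
static size_t merge_runs(FILE **runs, size_t nruns, FILE *out)
{
    if (nruns == 0)
        return 0;

    unsigned char *heads = malloc(nruns * RECORD_SIZE); /* next record per run */
    unsigned char *live  = malloc(nruns);               /* 1 if run not exhausted */
    size_t written = 0;

    if (!heads || !live) {
        free(heads);
        free(live);
        return (size_t)-1;
    }

    for (size_t i = 0; i < nruns; i++)
        live[i] = fread(heads + i * RECORD_SIZE, RECORD_SIZE, 1, runs[i]) == 1;

    for (;;) {
        size_t best = nruns;  /* index of run holding the smallest key */
        for (size_t i = 0; i < nruns; i++) {
            if (!live[i])
                continue;
            if (best == nruns ||
                record_key(heads + i * RECORD_SIZE) <
                record_key(heads + best * RECORD_SIZE))
                best = i;
        }
        if (best == nruns)
            break;            /* every run is exhausted */

        fwrite(heads + best * RECORD_SIZE, RECORD_SIZE, 1, out);
        written++;
        live[best] = fread(heads + best * RECORD_SIZE, RECORD_SIZE, 1,
                           runs[best]) == 1;
    }

    free(heads);
    free(live);
    return written;
}
```

With many runs, a heap keyed on the head records would cut the per-record
cost from O(N) to O(log N); the linear scan is kept here for brevity.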

It now takes 9/24 (37.5%) of the space it used to take in windows, and 9/20
(45%) of the space it used to take under unix.  I got rid of the long long
in the structure, so there is no alignment or padding, and simply use memcpy
to move things around, at a big savings in disk space.

Note that this is version 15.15, which is not yet out, but is working
well on ICC using this new book format.




Last modified: Thu, 15 Apr 21 08:11:13 -0700
