Computer Chess Club Archives


Subject: Re: To anyone who has ever automatically created an opening book

Author: Scott Gasch

Date: 12:52:25 11/12/01



On November 09, 2001 at 20:25:52, John Merlino wrote:
>
>To give a quick answer:
>
>Your RAM usage appears to be about the same as mine, but I find it very hard to
>believe that you can fully process a half-million game file and create a 90MB
>book file in 3 minutes. How is this possible? Just COPYING a 90MB file takes
>about 30 seconds on my PIII-600. A half-million game PGN file is approximately
>350MB, right?

Yes, maybe 3 minutes is a stretch, but it's very fast: certainly under 8 minutes
or so.

>But you're right about general memory availability. We really can't COUNT ON
>more than 30-50MB free at any one time (although a more typical machine today
>would probably have well over 100MB free). So, this limits us quite a bit.

There's a function on Windows that gives you a snapshot of current memory usage.
My code is not public, so I just assume all of the memory is available.  But you
could make a more educated guess about how much memory you have to work with by
calling GlobalMemoryStatusEx before allocating.  Of course there are problems
with this method: some other process could allocate a huge chunk of memory just
after your engine calls GlobalMemoryStatusEx, which makes the statistics it
returned stale.  But if I were writing a commercial engine, I'd make liberal use
of GlobalMemoryStatusEx and VirtualLock when building the book.
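
As a rough illustration of the "educated guess" idea, here is a minimal Python sketch that asks GlobalMemoryStatusEx for the free physical RAM through ctypes and turns it into a buffer size for the book builder. The structure layout follows the Win32 MEMORYSTATUSEX definition. Everything else is an assumption made for illustration and is not taken from either engine: the helper names, the 50% fraction, the 30MB floor (borrowed from the "30-50MB free" figure quoted above), and the 256MB cap. VirtualLock is left out, and the snapshot can still go stale right after the call, as noted above.

```python
import ctypes
import sys


class MEMORYSTATUSEX(ctypes.Structure):
    # Field layout follows the Win32 MEMORYSTATUSEX structure
    # (two DWORDs followed by seven DWORDLONGs).
    _fields_ = [
        ("dwLength", ctypes.c_uint32),
        ("dwMemoryLoad", ctypes.c_uint32),
        ("ullTotalPhys", ctypes.c_uint64),
        ("ullAvailPhys", ctypes.c_uint64),
        ("ullTotalPageFile", ctypes.c_uint64),
        ("ullAvailPageFile", ctypes.c_uint64),
        ("ullTotalVirtual", ctypes.c_uint64),
        ("ullAvailVirtual", ctypes.c_uint64),
        ("ullAvailExtendedVirtual", ctypes.c_uint64),
    ]


def available_physical_bytes():
    """Return free physical RAM in bytes, or None if it cannot be queried."""
    if sys.platform != "win32":
        return None  # GlobalMemoryStatusEx only exists on Windows
    status = MEMORYSTATUSEX()
    status.dwLength = ctypes.sizeof(MEMORYSTATUSEX)  # must be set before the call
    if not ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status)):
        return None
    return status.ullAvailPhys


def book_buffer_budget(avail_bytes, fraction=0.5,
                       floor=30 * 2**20, cap=256 * 2**20):
    """Pick a buffer size for book building (hypothetical policy).

    Takes a fraction of the free RAM, clamped between a floor and a cap.
    If the free RAM is unknown, fall back to the conservative floor.
    """
    if avail_bytes is None:
        return floor
    return max(floor, min(cap, int(avail_bytes * fraction)))


# Typical use: query once, just before allocating the book-building tables.
budget = book_buffer_budget(available_physical_bytes())
```

Taking only a fraction of the reported free memory leaves some slack for the case where another process allocates right after the snapshot is taken.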

>My test was with a book with just under 150,000 games in it. It took about 250MB
>of RAM (which ended up requiring about 100MB of swapping on my machine), and a
>little less than 4 hours to process at a depth of 40 plies. The result (after
>pruning zero-weight branches, which is, in effect, the same as your "straining"
>process) was a file that was about 550K. If I had not pruned the zero-weight
>branches, the file would have been about 6.1MB. Admittedly, though, this timing
>is during a debugging process, and I have not actually tried running it with a
>release build.

What is taking it so long?  Is it swapping?  That will kill the speed of book
generation, of course.  Is the PGN reader just really slow?  Have you tried
profiling the code during book generation?  It might give you an aha.

>However, I think our PGN reader code is one of the main bottlenecks. It appears
>to only be able to import about 100 games/second, and nobody else has reported
>anything less than 2,000 games/second. That's probably something I should look
>at.

Yes, I think my PGN reader pulls in around 5k games/sec.  I can only base this
on the time between error messages, but I'll often see:

ParsePGN: Error in game 10200 at line 530595.
(a couple of seconds pass)
ParsePGN: Error in game 19402 at line 1042949.

So it's pretty speedy.  100 games/sec seems slow... is your between-game reset
taking that long?  Is moving the pieces slowing you down?  What's the
bottleneck?
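
For what it's worth, the estimate above is just the difference in game numbers divided by the elapsed time, and the same arithmetic can be used to get a baseline for raw file reading. The sketch below is Python and not code from either engine. It assumes "a couple of seconds" means about 2 seconds, and it assumes every game starts with an [Event "..."] tag, which holds for export-format PGN but may not hold for every file. The function names are made up for illustration.

```python
import io
import time


def games_per_second(games, seconds):
    """Throughput in games/sec; infinite if no measurable time passed."""
    return games / seconds if seconds > 0 else float("inf")


def count_games(lines):
    """Count games by their [Event ...] tag (assumed to open each game)."""
    return sum(1 for line in lines if line.startswith("[Event "))


def time_scan(lines):
    """Scan the lines without parsing any moves; return (games, seconds)."""
    start = time.perf_counter()
    games = count_games(lines)
    return games, time.perf_counter() - start


# Estimate from the two error messages, assuming about 2 seconds between them.
estimate = games_per_second(19402 - 10200, 2.0)  # 9202 games / 2 s

# Tiny in-memory sample; a real run would pass an open PGN file instead.
SAMPLE = (
    '[Event "A"]\n[Result "1-0"]\n\n1. e4 e5 1-0\n\n'
    '[Event "B"]\n[Result "0-1"]\n\n1. d4 d5 0-1\n'
)
sample_games, sample_seconds = time_scan(io.StringIO(SAMPLE))
```

If a bare scan like this runs at thousands of games per second on the same file, then disk I/O is not the problem, and the time is going into move parsing, move making, or the between-game reset.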

Good luck.  I'll make a note to myself to run the book generator and send you
real book building numbers tomorrow.

Scott




Last modified: Thu, 15 Apr 21 08:11:13 -0700
