Computer Chess Club Archives




Subject: Parallel Crafty

Author: Robert Hyatt

Date: 20:48:39 03/19/98

Just a quick status report.  I've accomplished the following so

1.  I have implemented a Cray-Blitz (Dynamic Tree Splitting
algorithm) type of data structure and tested this thoroughly
to be sure that the original version and new version produce
identical node counts for all 300 WAC positions searched to a
fixed depth of 7,8,9 and 10 plies.

2.  My first cut at a parallel search is something quite a bit
"less" than the full-CB algorithm. I'm currently doing what I
would classify as a "cheapo" young-brothers wait algorithm.
After implementing this, I then spent a couple of days making
sure that *this* threaded version (but using only one simultaneous
thread (one up to a split point, one following that split point)
would also search the same node counts on the same WAC test above.
This exposed a couple of global variables I had forgotten about.
This has now been completed.

3.  I am now testing this YBW algorithm, but am constraining all of
the processors to work together at one node in the tree.  This is
not efficient at all, but it has exposed a few global variable updates
that I missed in 1/2 above.  This phase seems to be playing correct
chess.  I am now running wac with 1 processor and will follow that with
2 processors (all I have until my P6/200 X 4 machine shows up tomorrow
or monday [it apparently got snowed in at Denver airport]).  This is
again just an attempt to make sure that the parallelism has not broken

I have used posix threads for this, which means this code presently
will compile cleanly on linux and solaris for certain.  SGI and others
have also implemented posix threads, so any unix box should be a good

I don't yet have any solid performance numbers and don't expect them
to be very good yet.  YBW is not a great algorithm, and then
constraining all processors to work at a single node further reduces
the effectiveness.  I've seen several cases where the parallel crafty
claims to search 180K nodes per sec on a dual PII/300 machine, where
one processor reports 120K.  So it is getting better.

Other details:  As per my Ph.D. dissertation, I do *not* split the
tree at the root.  I don't believe that is reasonable since when the
search changes its mind, that root node is far bigger than the rest,
excepting the first.  So I have always, and still do, search the root
moves one by one, and let the parallel threads break things up further
out in the tree.

I am, as I said, using Posix threads, AKA lightweight processes.  As
a result, I can start/stop threads with practically no overhead, which
makes the code *extremely* readable.  This also has a chance to port to
NT once it is fully "enabled".

The only thing I have not tried yet is tablebase probes.  I have not
looked carefully enough at Steven's code to be sure it is thread-
safe, *yet*.  This has gone remarkably well to this point.  Now the
issue becomes performance...

More as I get this further developed, and particularly after the
quad-P6 machine arrives.  I can now smell a multiprocessor alpha
in my future.  :)

It will be a bit before this is released, but it is coming along...


This page took 0.02 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.