Author: Robert Hyatt
Date: 20:48:39 03/19/98
Just a quick status report. I've accomplished the following so far: 1. I have implemented a Cray-Blitz (Dynamic Tree Splitting algorithm) type of data structure and tested this thoroughly to be sure that the original version and new version produce identical node counts for all 300 WAC positions searched to a fixed depth of 7,8,9 and 10 plies. 2. My first cut at a parallel search is something quite a bit "less" than the full-CB algorithm. I'm currently doing what I would classify as a "cheapo" young-brothers wait algorithm. After implementing this, I then spent a couple of days making sure that *this* threaded version (but using only one simultaneous thread (one up to a split point, one following that split point) would also search the same node counts on the same WAC test above. This exposed a couple of global variables I had forgotten about. This has now been completed. 3. I am now testing this YBW algorithm, but am constraining all of the processors to work together at one node in the tree. This is not efficient at all, but it has exposed a few global variable updates that I missed in 1/2 above. This phase seems to be playing correct chess. I am now running wac with 1 processor and will follow that with 2 processors (all I have until my P6/200 X 4 machine shows up tomorrow or monday [it apparently got snowed in at Denver airport]). This is again just an attempt to make sure that the parallelism has not broken anything. I have used posix threads for this, which means this code presently will compile cleanly on linux and solaris for certain. SGI and others have also implemented posix threads, so any unix box should be a good candidate. I don't yet have any solid performance numbers and don't expect them to be very good yet. YBW is not a great algorithm, and then constraining all processors to work at a single node further reduces the effectiveness. I've seen several cases where the parallel crafty claims to search 180K nodes per sec on a dual PII/300 machine, where one processor reports 120K. So it is getting better. Other details: As per my Ph.D. dissertation, I do *not* split the tree at the root. I don't believe that is reasonable since when the search changes its mind, that root node is far bigger than the rest, excepting the first. So I have always, and still do, search the root moves one by one, and let the parallel threads break things up further out in the tree. I am, as I said, using Posix threads, AKA lightweight processes. As a result, I can start/stop threads with practically no overhead, which makes the code *extremely* readable. This also has a chance to port to NT once it is fully "enabled". The only thing I have not tried yet is tablebase probes. I have not looked carefully enough at Steven's code to be sure it is thread- safe, *yet*. This has gone remarkably well to this point. Now the issue becomes performance... More as I get this further developed, and particularly after the quad-P6 machine arrives. I can now smell a multiprocessor alpha in my future. :) It will be a bit before this is released, but it is coming along... Bob
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.