Author: Robert Hyatt
Date: 12:44:13 01/06/98
On January 06, 1998 at 15:35:18, Stuart Cracraft wrote:

>Maybe you should consider using an extensive just-checkmates
>suite or group of suites to test your parallel program.
>
>I use two combination suites, one mate suite, one endgame suite,
>and two rating tests, plus play multiple games at different time
>controls.
>
>--Stuart

Doesn't help. The problem, in this case, required the following:

1. N-1 legal moves, where N was the number of processors;

2. just the right number of interrupts so that the critical timing hole was activated, allowing that "extra" processor to be delayed just long enough to report back *after* at least one of the other processors had completed a search (something normally impossible, since it is always faster to search *nothing* than to search even one node...)

I.e., it's not the problem positions, it's the bizarre timing problems that occur in a program written for a machine that is built around shared memory, where any processor can write to any memory location whenever it wants to. It's up to the programmer to use semaphores to prevent the inevitable race conditions that might happen.

I believe that I found at least one parallel search bug in Cray Blitz every time we moved to a different machine, either one with a faster clock or one with more processors. *Every* time.

Correctness means one thing for a serial program, and something else entirely for a parallel program. The problems are very similar to those I handled for 20 years doing operating system kernel development. An interrupt at an inopportune time can change something in a way you might never expect. At a time when the whole world can see it happen, too. :)
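To make the shared-memory point concrete, here is a minimal sketch in Python of the situation described above: N workers at one node with only N-1 moves, so one worker finds nothing to search. This is not Cray Blitz code. The names `SplitPoint`, `claim_move`, `report`, `worker`, and `parallel_search` are hypothetical, and a `threading.Lock` stands in for the semaphore. The only point is that every read and write of the shared state happens while the lock is held, so the result does not depend on which worker is delayed or for how long.

```python
import threading


class SplitPoint:
    """Hypothetical shared state for one node searched by several workers."""

    def __init__(self, moves):
        self.lock = threading.Lock()   # stands in for the semaphore
        self.moves = list(moves)
        self.next_index = 0            # next unclaimed move
        self.best_move = None
        self.best_score = None
        self.searched = []             # moves that have reported back

    def claim_move(self):
        """Hand out the next move, or None if none are left."""
        with self.lock:
            if self.next_index >= len(self.moves):
                return None
            move = self.moves[self.next_index]
            self.next_index += 1
            return move

    def report(self, move, score):
        """Record a result; the compare-and-update is done under the lock."""
        with self.lock:
            self.searched.append(move)
            if self.best_score is None or score > self.best_score:
                self.best_score = score
                self.best_move = move


def worker(split, evaluate):
    """Search moves until none remain. A worker that gets no move just exits."""
    while True:
        move = split.claim_move()
        if move is None:
            return
        split.report(move, evaluate(move))


def parallel_search(moves, evaluate, num_workers):
    """Run num_workers threads over one split point and wait for all of them."""
    split = SplitPoint(moves)
    threads = [
        threading.Thread(target=worker, args=(split, evaluate))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                       # nothing is read until every worker is done
    return split
```

Calling `parallel_search` with three moves and four workers reproduces the N-1 case: at least one worker never gets a move and exits immediately. Without the lock in `report`, two workers could both pass the comparison and overwrite each other's result, which is the kind of race that only shows up under unlucky timing.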
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.