Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: Short Report from Frankfurt (Anand-Fritz)

Author: Don Dailey
Date: 14:15:15 06/22/98
On June 22, 1998 at 14:47:35, Robert Hyatt wrote:

>On June 22, 1998 at 13:29:31, Don Dailey wrote:
>
>>On June 22, 1998 at 11:26:42, Robert Hyatt wrote:
>>
>>>On June 22, 1998 at 10:50:01, Don Dailey wrote:
>>>
>>>>On June 22, 1998 at 09:51:27, Robert Hyatt wrote:
>>>>
>>>>>On June 22, 1998 at 02:37:40, Roland Pfister wrote:
>>>>>
>>>>>>I was on saturday at the townhall to watch Anand - Fritz and perhaps
>>>>>>meet some computer chess freaks.
>>>>>>
>>>>>>But the only persons I knew were the two Matthiasses from ChessBase
>>>>>>and Frederic Friedel. Frederic was busy photgraphing or videoing, so
>>>>>>I had a small talk with Matthias W. and he told me that Fritz played the
>>>>>>open too, at that moment having 5 points out of 5. Not bad.
>>>>>>
>>>>>>Then Mathias F. arrived, he was the one to operate Fritz. I asked him
>>>>>>how they did parallize Fritz and he answered "MTDF", because they
>>>>>>did not have the time to do something clever like Bob.
>>>>>
>>>>>this sounds like they did something very ugly, like searching a different
>>>>>mtd(f) on each processor?  But in any case, if they used mtd(f), then we
>>>>>didn't see the "real fritz" play chess here, because mtd(f) is also not
>>>>>quick and dirty to implement.  It requires lots of algorithm changes, it
>>>>>*definitely* requires significant hash-table changes, and so forth.  Seems
>>>>>like they were too interested in going faster, but overlooked difficulties
>>>>>in getting stronger.
>>>>
>>>>I think mtd(f) is a pretty logical choice for a parallel machine
>>>>if you were in a big hurry.  I think the key thing that they must
>>>>have considered was how clean and simple the algorithm is and
>>>>
>>>
>>>However, remember that I played with this too, and there are plenty of
>>>issues to keep someone busy for several weeks to get it right.  And if it
>>>is not "right" it will be slower than one processor using normal PVS.
>>
>>There are two issues,  getting it perfect, and getting a usable
>>implementation up and running.   Getting it perfect is something
>>I have yet to do in Cilkchess and we've both talked about what a
>>pain in the butt this is.   But the basic algorithm is pretty
>>easy to implement correctly (simpler than PVS) and I assert that
>>this is easily good enough, and that the other problems can be
>>ignored if you are in a hurry.  The hash tables will require a
>>minor change since there is only a single bound.   Our hash table
>>in Cilkchess is pretty much the same other than this, but if you are
>>aware of some refinements to the hash tables to make them more
>>compatible with mtd(f) I would like to check them out.
>
>as I reported months ago, I got better results adding another word to
>the table to store both bounds, so that when I hit on one side of the
>score one iteration, and then flip to the other side, I won't lose all
>the bounds from the first search, in case I get back on that side of
>the true score again...  wasn't a huge savings, but it certainly made a
>difference
>
>
>>
>>The tough issues in mtd are strange PV's, and in my opinion lazy
>>evaluation.  Not dealing with these things will make your program
>>a little weaker, but it's nothing to worked up over if you are
>>trying to get something up and running fast.  mtd's extra speed
>>may make this a wash anyway, depending on how much your program
>>relies on lazy evaluation (and also how much your program benefits
>>from mtd, which I believe can vary a lot from program to program.)
>
>when testing, I finally had to turn lazy eval off.  It was killing me
>by requiring far too many re-searches, since the eval would slowly drop
>as the lower bound dropped... or vice versa.. and that was no good.
>
>
>>
>>You may remember that I posted earlier that I believed certain types
>>of programs may be much better suited to mtd than others.  I don't
>>know where Fritz fits in to all of this.
>
>
>the *huge* issue I'd see is assembly language.  I did this in Cray Blitz.
>And it took months to debug everything, since all the data structures were
>optimized for max speed, and not for parallel threads.
>
>
>>
>>
>>>the zero width window searches which is ideal for parallel chess.
>>>>
>>>>Yes, there are side affects that eventually create lots of
>>>>implementation work later but these are not so serious that
>>>>cannot be ignored (if you are in a big hurry.)
>>>>
>>>>I'm guessing that they are not doing what you conjectured, but are
>>>>simply spawning off sibling searches in parallel with of course
>>>>mtd's zero width window.  I wonder if they have abort in the
>>>>code or simply wait?  I can imagine these considerations might
>>>>very well push you to consider mtd(f) in a parallel program.
>>>>
>>>
>>>that's what I conjectured, in fact, that they are doing the mtd(f)
>>>iterations at the root in parallel with each processor searching a
>>>different "window".  Doesn't sound particularly good, however, based on
>>>normal performance considerations.
>>
>>No, you misunderstood me.  I didn't mean at the mtd driver level but
>>recursively during the search using the same window.  I agree that if
>>they are doing what YOU said this is pretty cheesy, but I don't think
>>Franz would consider this.
>
>
>true, but if he didn't, my approach is actually far easier than mtd(f) as
>you don't change the basic search code, evaluation, hashing, or anything else,
>just some data structures and the thread code.  But the basic search engine
>itself needs only a couple of lines added and it is done, then you write some
>new code that is really pretty easy (but, as I said, the data structures are
>pretty complex, but they are available in Crafty already.)
>
>>
>>
>>>>Another possibility for easy implementation is mtd(f) with
>>>>young brothers wait, then spawn off siblings in parallel with
>>>>no aborting.  Not having to worry about re-searching must also
>>>>be a desirable characteristic for a quick and dirty
>>>>implementation and another reason they might have considered
>>>>mtd(f).
>>>>
>>>
>>>
>>>If they had done this, they could have used the code directly from Crafty
>>>as it quite easily does this, although the data structures are on the
>>>complex side.  But the statement "we chose mtd(f) rather than something
>>>more clever like Bob did" seems to say they don't split *inside* the tree
>>>at all...
>>
>>I haven't looked at your implementation but I understand how this
>>might be implied.  In view of all of this,  I will change my
>>"educated guess" to be that they spawn off all work in parallel
>>at the root level of the tree.  The driver is serial and they
>>probably do not abort work.  This would be the simplest
>>implementation and not ridiculous with zero width searches.
>
>
>this was our *first cut* in 1983 with Cray Blitz (non mtd() of course).  But
>it has a terrible performance limit.  If you change your mind multiple times
>at the root, this can work well.  If you stick with one move, you have a real
>upper bound on parallel speedup that is in the 1.5 range on a good day, and
>using null-move like fritz, is more like 1.1 or less...
>
>better branching factor means this works worse and worse.
>
>
>>
>>Of course your idea of doing mtd(f) probes in parallel is a
>>possibility too and if they were REALLY desparate they might
>>try this since I agree it is the simplest of all.
>>
>>There is a much simpler algorithm however but its fairly clear
>>they are not using it.  You can start each processor running
>>a full search, without worrying about scheduling issues.
>>Instead, you simply let the hash table do the scheduling work.
>>You store a counter, for each node of how many processors are looking
>>at a given node and have each processor order their moves based
>>on these counters.  I'm not even sure this is necessary if your
>>moves are ordered somewhat randomly, at least the weaker moves
>>at the end of the list.   I'm not sure this algorithm would be
>>as effective in a distributed processor/memory  environment.
>>
>>But I have always believed these issues to be fairly trivial, I
>>think 95% of the implemenation of a parallel program is how to
>>manage multiple states, 1 for each processor.  If your
>>program updates a global chess board with accompaning data
>>structures then reworking it to deal with this must be by far
>>the most tedious part.  I can easily imagine Franz spending most
>>of his time implementing and debugging this.   The only really
>>difficult issue left is how to abort a parallel search, which
>>he may not have bothered with.  I'm not even sure if this is
>>that difficult as I've never had to do it (I just call "cilk_abort"
>>in Cilkchess which aborts all work currently in progress that was
>>spawned by the current function.)
>>
>
>I keep a "threaded list" showing who is working for whom, at which node, so
>that I can abort everybody that is helping me at a place where I discover that
>I don't need their results due to a fail-high.
>
>
>>When Aske Plaat implemented a parallel version of Crafty, he
>>spent almost all of his time struggling with all the global
>>state Crafty carries around because Crafty was a serial program.
>>I don't think he had any problems with the scheduling algorithms
>>or actual parallel stuff.  Of course Aske is an expert in this
>>regard.
>>
>
>
>I spent two weeks turning all the global tree state into something accessed
>via a pointer, and then making node counts match between the two on all of the
>win at chess positions, as a sanity check.  Was the hardest part of the project,
>although handling the recursive search issues so that any processor can help any
>other processor (no master/slave relationship anywhere in crafty, everything is
>peer-to-peer in the search) was a pain to get right.
>
>
>
>>I guess the bottom line is that we can only speculate.  I suspect
>>we both envision different approaches he may be using based on
>>our own experiences with various algorithms like mtd(f) and
>>parallel searches.
>>
>>- Don
>
>
>I almost suspect this might have been a press-release issue more than anything
>else.  I don't see any way to "quickly" write a parallel search for a highly
>optimized hand-coded piece of assembler code.

Agreed!  The vision of Franz having a few hours to kludge something
together quickly is a little hard to swallow.

- Don
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.