Author: Robert Hyatt
Date: 11:47:35 06/22/98
On June 22, 1998 at 13:29:31, Don Dailey wrote:

>On June 22, 1998 at 11:26:42, Robert Hyatt wrote:
>
>>On June 22, 1998 at 10:50:01, Don Dailey wrote:
>>
>>>On June 22, 1998 at 09:51:27, Robert Hyatt wrote:
>>>
>>>>On June 22, 1998 at 02:37:40, Roland Pfister wrote:
>>>>
>>>>>On Saturday I was at the town hall to watch Anand - Fritz and perhaps
>>>>>meet some computer chess freaks.
>>>>>
>>>>>But the only persons I knew were the two Matthiases from ChessBase
>>>>>and Frederic Friedel. Frederic was busy photographing or videoing, so
>>>>>I had a small talk with Matthias W., and he told me that Fritz played
>>>>>the open too, at that moment having 5 points out of 5. Not bad.
>>>>>
>>>>>Then Mathias F. arrived; he was the one operating Fritz. I asked him
>>>>>how they parallelized Fritz and he answered "MTD(f)", because they
>>>>>did not have the time to do something clever like Bob.
>>>>
>>>>This sounds like they did something very ugly, like searching a
>>>>different mtd(f) on each processor? In any case, if they used mtd(f),
>>>>then we didn't see the "real Fritz" play chess here, because mtd(f) is
>>>>not quick and dirty to implement either. It requires lots of algorithm
>>>>changes, it *definitely* requires significant hash-table changes, and
>>>>so forth. It seems they were too interested in going faster and
>>>>overlooked the difficulties in getting stronger.
>>>
>>>I think mtd(f) is a pretty logical choice for a parallel machine
>>>if you were in a big hurry. I think the key thing they must
>>>have considered was how clean and simple the algorithm is and
>>>
>>
>>However, remember that I played with this too, and there are plenty of
>>issues to keep someone busy for several weeks to get it right. And if
>>it is not "right", it will be slower than one processor using normal
>>PVS.
>
>There are two issues: getting it perfect, and getting a usable
>implementation up and running.
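For reference, the mtd(f) driver itself is tiny; it is everything around it (hashing, lazy eval, PVs) that causes the work. A minimal sketch, where zwsearch() stands in for a fail-soft zero-window alpha-beta search (the stub here just returns a fixed score so the loop can be exercised; a real engine's search goes in its place):

```c
#define INF 100000

/* Stand-in for a fail-soft zero-window alpha-beta search. A real one
   returns a lower bound on fail-high and an upper bound on fail-low;
   this stub just returns a fixed "true score" for illustration. */
static int zwsearch(int alpha, int beta, int depth) {
    (void)alpha; (void)beta; (void)depth;
    return 37;
}

/* The mtd(f) driver: repeated zero-window probes that squeeze the
   [lower, upper] interval until it collapses on the minimax score. */
int mtdf(int f, int depth) {
    int g = f, lower = -INF, upper = INF;
    while (lower < upper) {
        int beta = (g == lower) ? g + 1 : g;  /* probe just above a fail-low */
        g = zwsearch(beta - 1, beta, depth);
        if (g < beta) upper = g;              /* fail low: new upper bound */
        else          lower = g;              /* fail high: new lower bound */
    }
    return g;
}
```

Each probe is a full zero-window search, which is why the hash table matters so much here: a re-probe should hit mostly cached bounds.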
>Getting it perfect is something
>I have yet to do in Cilkchess, and we've both talked about what a
>pain in the butt this is. But the basic algorithm is pretty
>easy to implement correctly (simpler than PVS), and I assert that
>this is easily good enough, and that the other problems can be
>ignored if you are in a hurry. The hash tables will require a
>minor change since there is only a single bound. Our hash table
>in Cilkchess is pretty much the same other than this, but if you are
>aware of some refinements to the hash tables to make them more
>compatible with mtd(f), I would like to check them out.

As I reported months ago, I got better results by adding another word
to the table to store both bounds, so that when I hit on one side of
the score in one iteration and then flip to the other side, I don't
lose all the bounds from the first search, in case I get back on that
side of the true score again. It wasn't a huge savings, but it
certainly made a difference.

>The tough issues in mtd are strange PVs and, in my opinion, lazy
>evaluation. Not dealing with these things will make your program
>a little weaker, but it's nothing to get worked up over if you are
>trying to get something up and running fast. mtd's extra speed
>may make this a wash anyway, depending on how much your program
>relies on lazy evaluation (and also how much your program benefits
>from mtd, which I believe can vary a lot from program to program.)

When testing, I finally had to turn lazy eval off. It was killing me
by requiring far too many re-searches, since the eval would slowly drop
as the lower bound dropped, or vice versa, and that was no good.

>You may remember that I posted earlier that I believed certain types
>of programs may be much better suited to mtd than others. I don't
>know where Fritz fits into all of this.

The *huge* issue I'd see is assembly language. I did this in Cray
Blitz.
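The two-bound refinement described above might look roughly like this sketch (illustrative names, not actual Crafty code): each entry keeps both a lower and an upper bound, so probes from either side of the true score can still cut off.

```c
#define INF 100000

/* Hypothetical hash entry for mtd(f) that keeps BOTH bounds, so
   flipping from one side of the true score to the other does not
   throw away what the earlier probes already proved. */
typedef struct {
    unsigned long long key;
    int lower, upper;   /* score is known to lie in [lower, upper] */
    int depth;
} TTEntry;

void tt_init(TTEntry *e, unsigned long long key, int depth) {
    e->key = key; e->depth = depth;
    e->lower = -INF; e->upper = INF;
}

/* After a zero-window probe at (beta-1, beta), tighten one side. */
void tt_update(TTEntry *e, int score, int beta) {
    if (score >= beta && score > e->lower) e->lower = score; /* fail high */
    if (score <  beta && score < e->upper) e->upper = score; /* fail low  */
}

/* On a probe, either stored bound may decide the current window. */
int tt_cutoff(const TTEntry *e, int beta, int *score) {
    if (e->lower >= beta) { *score = e->lower; return 1; }
    if (e->upper <  beta) { *score = e->upper; return 1; }
    return 0;
}
```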
And it took months to debug everything, since all the data structures
were optimized for max speed and not for parallel threads.

>
>>>the zero width window searches, which are ideal for parallel chess.
>>>
>>>Yes, there are side effects that eventually create lots of
>>>implementation work later, but these are not so serious that they
>>>cannot be ignored (if you are in a big hurry.)
>>>
>>>I'm guessing that they are not doing what you conjectured, but are
>>>simply spawning off sibling searches in parallel with, of course,
>>>mtd's zero width window. I wonder if they have abort in the
>>>code or simply wait? I can imagine these considerations might
>>>very well push you to consider mtd(f) in a parallel program.
>>>
>>
>>That's what I conjectured, in fact: that they are doing the mtd(f)
>>iterations at the root in parallel, with each processor searching a
>>different "window". Doesn't sound particularly good, however, based on
>>normal performance considerations.
>
>No, you misunderstood me. I didn't mean at the mtd driver level but
>recursively during the search using the same window. I agree that if
>they are doing what YOU said this is pretty cheesy, but I don't think
>Franz would consider this.

True, but if he didn't, my approach is actually far easier than mtd(f),
as you don't change the basic search code, evaluation, hashing, or
anything else, just some data structures and the thread code. The
basic search engine itself needs only a couple of lines added and it is
done; then you write some new code that is really pretty easy (but, as
I said, the data structures are pretty complex, though they are
available in Crafty already.)

>>>Another possibility for easy implementation is mtd(f) with
>>>young brothers wait, then spawn off siblings in parallel with
>>>no aborting. Not having to worry about re-searching must also
>>>be a desirable characteristic for a quick and dirty
>>>implementation and another reason they might have considered
>>>mtd(f).
>>>
>>
>>If they had done this, they could have used the code directly from
>>Crafty, as it quite easily does this, although the data structures are
>>on the complex side. But the statement "we chose mtd(f) rather than
>>something more clever like Bob did" seems to say they don't split
>>*inside* the tree at all...
>
>I haven't looked at your implementation, but I understand how this
>might be implied. In view of all of this, I will change my
>"educated guess" to be that they spawn off all work in parallel
>at the root level of the tree. The driver is serial and they
>probably do not abort work. This would be the simplest
>implementation, and not ridiculous with zero width searches.

This was our *first cut* in 1983 with Cray Blitz (non-mtd(), of
course). But it has a terrible performance limit. If you change your
mind multiple times at the root, this can work well. If you stick with
one move, you have a real upper bound on parallel speedup that is in
the 1.5 range on a good day, and using null-move like Fritz, it is more
like 1.1 or less... A better branching factor means this works worse
and worse.

>Of course your idea of doing mtd(f) probes in parallel is a
>possibility too, and if they were REALLY desperate they might
>try this, since I agree it is the simplest of all.
>
>There is a much simpler algorithm, however, but it's fairly clear
>they are not using it. You can start each processor running
>a full search, without worrying about scheduling issues.
>Instead, you simply let the hash table do the scheduling work.
>You store a counter for each node of how many processors are looking
>at a given node, and have each processor order their moves based
>on these counters. I'm not even sure this is necessary if your
>moves are ordered somewhat randomly, at least the weaker moves
>at the end of the list. I'm not sure this algorithm would be
>as effective in a distributed processor/memory environment.
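That "let the hash table do the scheduling" idea might be sketched like this (names are illustrative, not any real engine's code): each node's entry carries a count of processors currently inside it, and an idle processor sorts its move list so the least-contended moves come first.

```c
#include <stdlib.h>

/* Illustrative sketch of hash-table scheduling: 'busy' would come
   from a per-node counter in the shared hash table, incremented when
   a processor enters the subtree and decremented when it leaves. */
typedef struct {
    int move;   /* the move itself (encoding unimportant here) */
    int busy;   /* processors currently searching this subtree  */
} RootMove;

static int by_busy(const void *a, const void *b) {
    return ((const RootMove *)a)->busy - ((const RootMove *)b)->busy;
}

/* Each processor reorders its own list so it drifts toward work
   nobody else is doing, with no explicit scheduler at all. */
void order_by_contention(RootMove *moves, int n) {
    qsort(moves, n, sizeof *moves, by_busy);
}
```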
>
>But I have always believed these issues to be fairly trivial. I
>think 95% of the implementation of a parallel program is how to
>manage multiple states, one for each processor. If your
>program updates a global chess board with accompanying data
>structures, then reworking it to deal with this must be by far
>the most tedious part. I can easily imagine Franz spending most
>of his time implementing and debugging this. The only really
>difficult issue left is how to abort a parallel search, which
>he may not have bothered with. I'm not even sure if this is
>that difficult, as I've never had to do it (I just call "cilk_abort"
>in Cilkchess, which aborts all work currently in progress that was
>spawned by the current function.)

I keep a "threaded list" showing who is working for whom, at which
node, so that I can abort everybody that is helping me at a place
where I discover that I don't need their results due to a fail-high.

>When Aske Plaat implemented a parallel version of Crafty, he
>spent almost all of his time struggling with all the global
>state Crafty carries around, because Crafty was a serial program.
>I don't think he had any problems with the scheduling algorithms
>or actual parallel stuff. Of course Aske is an expert in this
>regard.

I spent two weeks turning all the global tree state into something
accessed via a pointer, and then making node counts match between the
two versions on all of the Win at Chess positions, as a sanity check.
That was the hardest part of the project, although handling the
recursive search issues so that any processor can help any other
processor (no master/slave relationship anywhere in Crafty; everything
is peer-to-peer in the search) was a pain to get right.

>I guess the bottom line is that we can only speculate. I suspect
>we both envision different approaches he may be using based on
>our own experiences with various algorithms like mtd(f) and
>parallel searches.
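That refactor, turning global tree state into something reached through a pointer, reduces to something like this sketch (field names are made up, not Crafty's actual ones): every search routine takes the state explicitly, so any number of threads can search simultaneously without interfering.

```c
/* Hypothetical per-thread search state: everything that used to be a
   global (board, counters, PV, killers...) lives in one struct that
   each thread owns privately and passes down the recursion. */
typedef struct {
    int ply;            /* current depth in this thread's search */
    long nodes;         /* per-thread node counter (was global)  */
    /* ...board, move lists, killers, history, PV, etc. */
} TreeState;

/* A toy re-entrant "search": it touches only *tree, never a global,
   so two threads with two TreeStates cannot step on each other. */
long search(TreeState *tree, int depth) {
    tree->nodes++;
    if (depth == 0) return tree->nodes;
    tree->ply++;
    long n = search(tree, depth - 1);
    tree->ply--;
    return n;
}
```

Making per-thread node counts reproduce the serial program's counts, as described above, is then a direct sanity check that no hidden global state remains.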
>
>- Don

I almost suspect this might have been a press-release issue more than
anything else. I don't see any way to "quickly" write a parallel
search for a highly optimized, hand-coded piece of assembler code.
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.