Author: Don Dailey
Date: 14:15:15 06/22/98
Go up one level in this thread
On June 22, 1998 at 14:47:35, Robert Hyatt wrote: >On June 22, 1998 at 13:29:31, Don Dailey wrote: > >>On June 22, 1998 at 11:26:42, Robert Hyatt wrote: >> >>>On June 22, 1998 at 10:50:01, Don Dailey wrote: >>> >>>>On June 22, 1998 at 09:51:27, Robert Hyatt wrote: >>>> >>>>>On June 22, 1998 at 02:37:40, Roland Pfister wrote: >>>>> >>>>>>I was on saturday at the townhall to watch Anand - Fritz and perhaps >>>>>>meet some computer chess freaks. >>>>>> >>>>>>But the only persons I knew were the two Matthiasses from ChessBase >>>>>>and Frederic Friedel. Frederic was busy photgraphing or videoing, so >>>>>>I had a small talk with Matthias W. and he told me that Fritz played the >>>>>>open too, at that moment having 5 points out of 5. Not bad. >>>>>> >>>>>>Then Mathias F. arrived, he was the one to operate Fritz. I asked him >>>>>>how they did parallize Fritz and he answered "MTDF", because they >>>>>>did not have the time to do something clever like Bob. >>>>> >>>>>this sounds like they did something very ugly, like searching a different >>>>>mtd(f) on each processor? But in any case, if they used mtd(f), then we >>>>>didn't see the "real fritz" play chess here, because mtd(f) is also not >>>>>quick and dirty to implement. It requires lots of algorithm changes, it >>>>>*definitely* requires significant hash-table changes, and so forth. Seems >>>>>like they were too interested in going faster, but overlooked difficulties >>>>>in getting stronger. >>>> >>>>I think mtd(f) is a pretty logical choice for a parallel machine >>>>if you were in a big hurry. I think the key thing that they must >>>>have considered was how clean and simple the algorithm is and >>>> >>> >>>However, remember that I played with this too, and there are plenty of >>>issues to keep someone busy for several weeks to get it right. And if it >>>is not "right" it will be slower than one processor using normal PVS. >> >>There are two issues, getting it perfect, and getting a usable >>implementation up and running. Getting it perfect is something >>I have yet to do in Cilkchess and we've both talked about what a >>pain in the butt this is. But the basic algorithm is pretty >>easy to implement correctly (simpler than PVS) and I assert that >>this is easily good enough, and that the other problems can be >>ignored if you are in a hurry. The hash tables will require a >>minor change since there is only a single bound. Our hash table >>in Cilkchess is pretty much the same other than this, but if you are >>aware of some refinements to the hash tables to make them more >>compatible with mtd(f) I would like to check them out. > >as I reported months ago, I got better results adding another word to >the table to store both bounds, so that when I hit on one side of the >score one iteration, and then flip to the other side, I won't lose all >the bounds from the first search, in case I get back on that side of >the true score again... wasn't a huge savings, but it certainly made a >difference > > >> >>The tough issues in mtd are strange PV's, and in my opinion lazy >>evaluation. Not dealing with these things will make your program >>a little weaker, but it's nothing to worked up over if you are >>trying to get something up and running fast. mtd's extra speed >>may make this a wash anyway, depending on how much your program >>relies on lazy evaluation (and also how much your program benefits >>from mtd, which I believe can vary a lot from program to program.) > >when testing, I finally had to turn lazy eval off. It was killing me >by requiring far too many re-searches, since the eval would slowly drop >as the lower bound dropped... or vice versa.. and that was no good. > > >> >>You may remember that I posted earlier that I believed certain types >>of programs may be much better suited to mtd than others. I don't >>know where Fritz fits in to all of this. > > >the *huge* issue I'd see is assembly language. I did this in Cray Blitz. >And it took months to debug everything, since all the data structures were >optimized for max speed, and not for parallel threads. > > >> >> >>>the zero width window searches which is ideal for parallel chess. >>>> >>>>Yes, there are side affects that eventually create lots of >>>>implementation work later but these are not so serious that >>>>cannot be ignored (if you are in a big hurry.) >>>> >>>>I'm guessing that they are not doing what you conjectured, but are >>>>simply spawning off sibling searches in parallel with of course >>>>mtd's zero width window. I wonder if they have abort in the >>>>code or simply wait? I can imagine these considerations might >>>>very well push you to consider mtd(f) in a parallel program. >>>> >>> >>>that's what I conjectured, in fact, that they are doing the mtd(f) >>>iterations at the root in parallel with each processor searching a >>>different "window". Doesn't sound particularly good, however, based on >>>normal performance considerations. >> >>No, you misunderstood me. I didn't mean at the mtd driver level but >>recursively during the search using the same window. I agree that if >>they are doing what YOU said this is pretty cheesy, but I don't think >>Franz would consider this. > > >true, but if he didn't, my approach is actually far easier than mtd(f) as >you don't change the basic search code, evaluation, hashing, or anything else, >just some data structures and the thread code. But the basic search engine >itself needs only a couple of lines added and it is done, then you write some >new code that is really pretty easy (but, as I said, the data structures are >pretty complex, but they are available in Crafty already.) > >> >> >>>>Another possibility for easy implementation is mtd(f) with >>>>young brothers wait, then spawn off siblings in parallel with >>>>no aborting. Not having to worry about re-searching must also >>>>be a desirable characteristic for a quick and dirty >>>>implementation and another reason they might have considered >>>>mtd(f). >>>> >>> >>> >>>If they had done this, they could have used the code directly from Crafty >>>as it quite easily does this, although the data structures are on the >>>complex side. But the statement "we chose mtd(f) rather than something >>>more clever like Bob did" seems to say they don't split *inside* the tree >>>at all... >> >>I haven't looked at your implementation but I understand how this >>might be implied. In view of all of this, I will change my >>"educated guess" to be that they spawn off all work in parallel >>at the root level of the tree. The driver is serial and they >>probably do not abort work. This would be the simplest >>implementation and not ridiculous with zero width searches. > > >this was our *first cut* in 1983 with Cray Blitz (non mtd() of course). But >it has a terrible performance limit. If you change your mind multiple times >at the root, this can work well. If you stick with one move, you have a real >upper bound on parallel speedup that is in the 1.5 range on a good day, and >using null-move like fritz, is more like 1.1 or less... > >better branching factor means this works worse and worse. > > >> >>Of course your idea of doing mtd(f) probes in parallel is a >>possibility too and if they were REALLY desparate they might >>try this since I agree it is the simplest of all. >> >>There is a much simpler algorithm however but its fairly clear >>they are not using it. You can start each processor running >>a full search, without worrying about scheduling issues. >>Instead, you simply let the hash table do the scheduling work. >>You store a counter, for each node of how many processors are looking >>at a given node and have each processor order their moves based >>on these counters. I'm not even sure this is necessary if your >>moves are ordered somewhat randomly, at least the weaker moves >>at the end of the list. I'm not sure this algorithm would be >>as effective in a distributed processor/memory environment. >> >>But I have always believed these issues to be fairly trivial, I >>think 95% of the implemenation of a parallel program is how to >>manage multiple states, 1 for each processor. If your >>program updates a global chess board with accompaning data >>structures then reworking it to deal with this must be by far >>the most tedious part. I can easily imagine Franz spending most >>of his time implementing and debugging this. The only really >>difficult issue left is how to abort a parallel search, which >>he may not have bothered with. I'm not even sure if this is >>that difficult as I've never had to do it (I just call "cilk_abort" >>in Cilkchess which aborts all work currently in progress that was >>spawned by the current function.) >> > >I keep a "threaded list" showing who is working for whom, at which node, so >that I can abort everybody that is helping me at a place where I discover that >I don't need their results due to a fail-high. > > >>When Aske Plaat implemented a parallel version of Crafty, he >>spent almost all of his time struggling with all the global >>state Crafty carries around because Crafty was a serial program. >>I don't think he had any problems with the scheduling algorithms >>or actual parallel stuff. Of course Aske is an expert in this >>regard. >> > > >I spent two weeks turning all the global tree state into something accessed >via a pointer, and then making node counts match between the two on all of the >win at chess positions, as a sanity check. Was the hardest part of the project, >although handling the recursive search issues so that any processor can help any >other processor (no master/slave relationship anywhere in crafty, everything is >peer-to-peer in the search) was a pain to get right. > > > >>I guess the bottom line is that we can only speculate. I suspect >>we both envision different approaches he may be using based on >>our own experiences with various algorithms like mtd(f) and >>parallel searches. >> >>- Don > > >I almost suspect this might have been a press-release issue more than anything >else. I don't see any way to "quickly" write a parallel search for a highly >optimized hand-coded piece of assembler code. Agreed! The vision of Franz having a few hours to kludge something together quickly is a little hard to swallow. - Don
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.