Author: Georg v. Zimmermann
Date: 14:49:33 09/03/02
Hallo, I am afraid I have to say I don't like this at all. IMHO it is a very bad idea to include "total nodes searched" figures that are not actual total nodes searched in a scientific article. I do not see why you did not just include speedup factors instead. And could you elaborate on why you couldn't do node counts? I know next to nothing about multi-processor search, so I don't understand why you can't simply do positionsSearched[processor]++; and later add them all up, or something similar. AFAIK all other Deep Something programs report node counts even when they don't get 100% CPU time?

Finally I would like to say that I appreciate Dr. Hyatt's work a lot, and however questionable this one article might or might not be (not all that glitters is gold), I would probably never have had as much fun with computer chess without him helping the guys who helped me understand things. :)

Georg

On September 03, 2002 at 17:30:35, Robert Hyatt wrote:

>As is usually the case, someone that helped me with this had sent an email while I was responding to the other posts. And when I read the second paragraph, it all "came back".
>
>Here is the issue:
>
>I started with a 16-cpu game log file. Note that this was from a real game. And in it I would find output just like Crafty's. Here is the idea:
>
>    depth   time   eval   PV
>
>followed by a summary.
>
>The problem is that the node count in the summary has nothing to do with the PV when it was displayed. The program _could_ have stopped the search as soon as the PV was displayed, or it could have stopped the search minutes later. As a result, I had no real node counts for the 16-cpu test that could be compared to anything else, since there was no way to know when the 16-cpu test completed.
>
>We chose to do the following:
>
>1. Run the positions through a one-processor search. Since there was no parallel searching going on, we could display an _exact_ node count for the one-processor test, as it would have been had the search stopped immediately after producing the critical PV move at the final depth. That value _is_ a raw data point.
>
>2. We then ran the positions through the 2-processor search, taking the time for the same PV as the time. All the times are pure raw data, exactly. But we couldn't get a good node count. What we chose to do was to use an internal performance monitor we had built in, which very precisely told us how much cpu time had been spent playing chess by each processor. From these times, we computed speedups for 2 processors, 4, 8 and 16 (we didn't run the 16-cpu test again; we just used the raw log from the MChess Pro game).
>
>3. We now had a set of speedups for each test, which we plugged into the article. And again, it is important to note that for this data, the raw speedup was computed by dividing the times as you would expect.
>
>For the node counts, which were impossible for us to obtain from any but the one-processor test, we simply extrapolated them based on the cpu utilization of all the processors. Some simple testing by searching to a fixed depth on one processor and then on 16 processors shows that our "extrapolation" was "right on", and we used those node counts.
>
>4. Clearly, the node counts are therefore produced from the raw 1-cpu data, multiplied by the percent of cpu utilization for the 2-, 4-, 8- and 16-cpu test cases. So they should correlate 100%.
>
>The only thing that my (nameless) partner said was that he could not remember if we did the same thing to produce the times, since it would have been easier than trying to extract them from the logs later to produce the table for times. He "thought" that the times were added after a request from a referee, so that is possible.
>
>So, perhaps the data has some questionable aspects to it. The only part that I am _certain_ is "raw data" is the individual speedup values, because that is what we were looking at specifically. I had not remembered the node count problem until this email came in, and then I remembered a case where Vincent was trying to prove something about Crafty and got node counts suggesting that it should have gotten a > 2.0 speedup. I had pointed out that the way I do nodes, it is impossible to produce them anywhere except when all processors are idle, if you want an accurate number. I _should_ have remembered that we had the same problem back then. I am therefore afraid that the times might have been computed in the same way, since it would have been quite natural to do so.
>
>I don't think this changes one iota about what is going on, of course. Given a speedup, and the total time used by Crafty, I can certainly compute a node count that will be _very_ close to the real one. Which I suppose I should add, so that Vincent can have his "every time the PV changes give me nodes" type of value.
>
>Keep in mind that this was an email from someone that worked on this with me back then. His memory was somewhat better because he actually wrote the code to solve the problem. But again, he was _very_ vague in remembering everything. It took a phone call for us to discuss this to get as far as I did above. I might remember more as time goes on.
>
>But the bottom line is "trust the speedup numbers explicitly". And if you trust them, the others can be directly derived from them. For 16 cpus, Cray Blitz generally searched 100% of the time on each cpu. If it produced a speedup of 16, then each cpu searched 1/16th the total nodes searched by one processor. If it produced a speedup of 8, then each cpu searched 1/8 of the nodes searched by one processor, which is 2x the total nodes, aka search overhead.
>
>Sorry for the confusion. Stuff done 10 years ago is difficult enough. Remembering the "log eater" was harder since I didn't write all of it...
>
>Bob
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.