Author: Uri Blass
Date: 23:47:20 09/03/02
On September 03, 2002 at 21:47:01, Robert Hyatt wrote:

>On September 03, 2002 at 20:29:19, Rolf Tueschen wrote:
>
>>On September 03, 2002 at 17:30:35, Robert Hyatt wrote:
>>
>>>As is usually the case, someone that helped me with this had sent an email while I was responding to the other posts. And when I read the second paragraph, it all "came back".
>>>
>>>Here is the issue:
>>>
>>>I started with a 16 cpu game log file. Note that this was from a real game. And in it I would find output just like Crafty's... Here is the idea:
>>>
>>>  depth  time  eval  PV
>>>
>>>followed by a summary.
>>>
>>>The problem is that the node count in the summary has nothing to do with the PV when it was displayed. The program _could_ have stopped the search as soon as the PV was displayed, or it could have stopped the search minutes later. As a result, I had no real node counts for the 16 cpu test that could be compared to anything else since there was no way to know when the 16 cpu test completed.
>>>
>>>We chose to do the following:
>>>
>>>1. run the positions thru a one processor search, and since there was no parallel searching going on, we could display an _exact_ node count for the one-processor test, as it would have been had the search stopped immediately after producing the critical PV move at the final depth. That value _is_ a raw data point.
>>>
>>>2. We then ran the positions thru the 2-processor search, taking the time for the same PV as the time. All the times are pure raw data, exactly. But we couldn't get a good node count. What we chose to do was to use an internal performance monitor we had built in, that very precisely told us how much cpu time had been spent playing chess by each processor. From these times, we computed speedups for 2 processors, 4, 8 and 16 (we didn't run the 16 cpu test again, we just used the raw log from the mchess pro game...
>>>
>>>3. We now had a set of speedups for each test. Which we plugged into the article. And again, it is important to note that for this data, the raw speedup was computed by dividing the times as you would expect.
>>>
>>>For the node counts, which was impossible for us to obtain from any but the one processor test, we simply extrapolated them based on the cpu utilization of all the processors. Some simple testing by searching to a fixed depth on one processor and then 16 processors shows that our "extrapolation" was "right on"... and we used those node counts.
>>>
>>>4. Clearly, the node counts are therefore produced from the raw 1-cpu data, multiplied by the percent of cpu utilization for the 2, 4, 8 and 16 cpu test cases. So they should correlate 100%.
>>>
>>>The only thing that my (nameless) partner said was that he could not remember if we did the same thing to produce the times since it would have been easier than trying to extract them from the logs later to produce the table for times. He "thought" that the times were added after a request from a referee, so that is possible.
>>>
>>>So, perhaps the data has some questionable aspects to it. The only part that I am _certain_ is "raw data" is the individual speedup values, because that is what we were looking at specifically. I had not remembered the node count problem until this email came in and then I remembered a case where Vincent was trying to prove something about crafty and got node counts suggesting that it should have gotten a > 2.0 speedup.
>>>I had pointed out that the way I do nodes, it is impossible to produce them anywhere except when all processors are idle, if you want an accurate number. I _should_ have remembered that we had the same problem back then. I am therefore afraid that the times might have been computed in the same way since it would have been quite natural to do so...
>>>
>>>I don't think this changes one iota about what is going on, of course. As given a speedup, and total time used by Crafty, I can certainly compute a node count that will be _very_ close to the real one. Which I suppose I should add so that Vincent can have his "every time the PV changes give me nodes" type of value.
>>>
>>>Keep in mind that this was an email from someone that worked on this with me back then. His memory was somewhat better because he actually wrote the code to solve the problem. But again, he was _very_ vague in remembering everything. It took a phone call for us to discuss this to get as far as I did above. I might remember more as time goes on.
>>>
>>>But the bottom line is "trust the speedup numbers explicitly". And if you trust them, the others can be directly derived from them. For 16 cpus, Cray Blitz generally searched 100% of the time on each cpu. If it produced a speedup of 16, then each cpu searched 1/16th the total nodes searched by one processor. If it produced a speedup of 8, then each cpu searched 1/8 of the nodes searched by one processor, which is 2x the total nodes, aka search overhead.
>>>
>>>Sorry for the confusion. Stuff done 10 years ago is difficult enough. Remembering the "log eater" was harder since I didn't write all of it...
>>>
>>>Bob
>>
>>Bob,
>>would you be so kind to explain from your standpoint (and also in the light of the adding here) how Vincent's conclusions could be understood?
>
>What he finally said was that either the nodes or times (I don't remember which) are perfect multiples of the speedup numbers. IE a few of the results where CB said 6.1 speedup showed that either the nodes or the times (one of them or both for all I know) was exactly 6.1X the one cpu result, which would be unlikely to happen more than a few times. And he was correct.
>
>Unfortunately, he never asked about that at all, in private, otherwise I could probably have come up with the right answer. But he chose to make a public fuss about it so that his sponsor will not look at the Cray Blitz speedup as "real"... so that his speedup (which is quite obviously well below 11.1 on 16 cpus) will not look so bad to them.

I agree here with GCP. If Vincent's target was to convince the sponsor not to look at the speedup of Cray Blitz as real, he probably succeeded. He does not need to prove that the speedup results are a lie, only to convince them not to trust the results.

The times, and not the speedup, are the important information. The times are obtained first, and the speedup is calculated only later, after the times are known.

Usually we tend to trust scientists, but if the information about the times is wrong, then it means that we cannot trust the other details in the article.

Uri
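For anyone who wants to check the arithmetic in Bob's quoted email, here is a minimal sketch, in C, of the relationship he describes between speedup, node counts and search overhead. It assumes 100% utilization on every cpu, and all of the input numbers are purely illustrative, not taken from the Cray Blitz article:

/* Sketch of the extrapolation described above: given the raw 1-cpu node
 * count and the measured times, derive the N-cpu node count (assuming
 * every cpu is busy 100% of the time) and the resulting search overhead.
 * All input numbers are made up for illustration. */
#include <stdio.h>

int main(void) {
    double nodes_1cpu = 100.0e6;  /* raw node count from the 1-cpu run (hypothetical) */
    double time_1cpu  = 1600.0;   /* seconds for the 1-cpu run (hypothetical)         */
    double time_16cpu = 200.0;    /* seconds for the 16-cpu run (hypothetical)        */
    int    ncpus      = 16;

    /* the speedup is the raw data point: a plain ratio of the two times */
    double speedup = time_1cpu / time_16cpu;             /* here: 8.0 */

    /* with ~100% utilization, each cpu searches for time_16cpu at the
       1-cpu node rate, so total nodes scale with ncpus / speedup */
    double node_rate   = nodes_1cpu / time_1cpu;
    double nodes_16cpu = node_rate * time_16cpu * ncpus;

    /* search overhead: extra nodes relative to the 1-cpu search */
    double overhead = nodes_16cpu / nodes_1cpu;           /* here: 2.0, i.e. 2x */

    printf("speedup  = %.1f\n", speedup);
    printf("nodes_16 = %.0f (overhead %.1fx)\n", nodes_16cpu, overhead);
    return 0;
}

With these made-up inputs the program prints a speedup of 8.0 on 16 cpus and a 16-cpu node count of twice the 1-cpu count, which is exactly the "speedup of 8, so 2x the total nodes, aka search overhead" case Bob describes.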