Computer Chess Club Archives


Subject: Re: More on the "bad math" after an important email...

Author: Georg v. Zimmermann

Date: 14:49:33 09/03/02



Hello,

I am afraid I have to say I don't like this at all. IMHO it is a very bad idea to
include "total nodes searched" figures that are not actually total nodes searched
in a scientific article.
I do not see why you did not just include speedup factors instead.

And could you elaborate on why you couldn't do node counts? I know next to
nothing about multi-processor search, so I don't understand why you can't simply
do positionsSearched[processor]++; and later add them all up, or something
similar. AFAIK all the other Deep Something programs report node counts even
when they don't get 100% cpu time?
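
Something like this is what I have in mind - only a rough sketch of the
per-processor counter idea; the names (MAX_CPUS, CountNode, TotalNodes) are
mine, not taken from Crafty or any real engine:

  #define MAX_CPUS 16

  /* one counter per processor, so no locking is needed while searching */
  static unsigned long nodeCount[MAX_CPUS];

  /* each processor bumps only its own counter from inside its search */
  void CountNode(int processor) {
      nodeCount[processor]++;
  }

  /* once every processor has stopped, add them all up for the total */
  unsigned long TotalNodes(int ncpus) {
      unsigned long total = 0;
      for (int cpu = 0; cpu < ncpus; cpu++)
          total += nodeCount[cpu];
      return total;
  }

If the counters can only be summed safely once every processor has gone idle,
then I can begin to see why a node count tied to the moment a PV is printed is
a problem.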

Finally, I would like to say that I appreciate Dr. Hyatt's work a lot, and
however questionable this one article might be - or might not be, not all that
glitters is gold - I would probably never have had as much fun with computer
chess without him helping the guys who helped me understand things. :)

Georg




On September 03, 2002 at 17:30:35, Robert Hyatt wrote:

>As is usually the case, someone that helped me with this had sent an email
>while I was responding to the other posts.  And when I read the second
>paragraph, it all "came back".
>
>Here is the issue:
>
>I started with a 16 cpu game log file.  Note that this was from a real
>game.  And in it I would find output just like Crafty's...  Here is the
>idea:
>
>      depth    time    eval    PV
>
>followed by a summary.
>
>The problem is that the node count in the summary has nothing to do with the
>point at which the PV was displayed.  The program _could_ have stopped the
>search as soon as the PV was displayed, or it could have stopped the search
>minutes later.
>As a result, I had no real node counts for the 16 cpu test that could be
>compared to anything else since there was no way to know when the 16 cpu
>test completed.
>
>We chose to do the following:
>
>1.  Run the positions through a one-processor search.  Since there was no
>parallel searching going on, we could display an _exact_ node count for the
>one-processor test, as it would have been had the search stopped immediately
>after producing the critical PV move at the final depth.  That value _is_ a
>raw data point.
>
>2.  We then ran the positions through the 2-processor search, using the time
>at which the same PV was produced as the measured time.  All the times are pure
>raw data, exactly.  But we couldn't get a good node count.  What we chose to do
>was to use an internal performance monitor we had built in, which told us very
>precisely how much cpu time had been spent playing chess by each processor.
>From these times, we computed speedups for 2 processors, 4, 8 and 16 (we didn't
>run the 16 cpu test again, we just used the raw log from the mchess pro game).
>
>3.  We now had a set of speedups for each test.  Which we plugged into the
>article.  And again, it is important to note that for this data, the raw
>speedup was computed by dividing the times as you would expect.
>
>For the node counts, which were impossible for us to obtain from any but the
>one processor test, we simply extrapolated them based on the cpu utilization
>of all the processors.  Some simple testing by searching to a fixed depth on
>one processor and then 16 processors shows that our "extrapolation" was "right
>on"...  and we used those node counts.
>
>4.  Clearly, the node counts are therefore produced from the raw 1-cpu data,
>multiplied by the percent of cpu utilization for the 2,4,8 and 16 cpu test
>cases.  So they should correlate 100%.
>
>The only thing that my (nameless) partner said was that he could not remember
>if we did the same thing to produce the times since it would have been easier
>than trying to extract them from the logs later to produce the table for times.
>He "thought" that the times were added after a request from a referee, so that
>is possible.
>
>So, perhaps the data has some questionable aspects to it.  The only part that
>I am _certain_ is "raw data" is the individual speedup values, because that is
>what we were looking at specifically.  I had not remembered the node count
>problem until this email came in and then I remembered a case where Vincent
>was trying to prove something about crafty and got node counts suggesting that
>it should have gotten a > 2.0 speedup.  I had pointed out that the way I do
>nodes, it is impossible to produce them anywhere except when all processors are
>idle, if you want an accurate number.  I _should_ have remembered that we had
>the same problem back then.  I am therefore afraid that the times might have
>been computed in the same way since it would have been quite natural to do
>so...
>
>I don't think this changes one iota about what is going on, of course.  Given
>a speedup and the total time used by Crafty, I can certainly compute a node
>count that will be _very_ close to the real one.  Which I suppose I should add
>so that Vincent can have his "every time the PV changes give me nodes" type of
>value.
>
>Keep in mind that this was an email from someone that worked on this with me
>back then.  His memory was somewhat better because he actually wrote the code
>to solve the problem.  But again, he was _very_ vague in remembering everything.
>It took a phone call for us to discuss this to get as far as I did above.  I
>might remember more as time goes on.
>
>But the bottom line is "trust the speedup numbers explicitly".  And if you
>trust them, the others can be directly derived from them.  For 16 cpus, Cray
>Blitz generally searched 100% of the time on each cpu.  If it produced a speedup
>of 16, then each cpu searched 1/16th the total nodes searched by one processor.
>If it produced a speedup of 8, then each cpu searched 1/8 of the nodes searched
>by one processor, which is 2x the total nodes, aka search overhead.
>
>Sorry for the confusion.  Stuff done 10 years ago is difficult enough.
>Remembering the "log eater" was harder since I didn't write all of it...
>
>Bob
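
PS: Just to check that I follow the arithmetic at the end: with all 16 cpus busy
for the whole wall-clock time, and each cpu searching at roughly the same speed
as the single-cpu run, a speedup of S means each cpu runs for about 1/S of the
one-cpu time, so the extrapolated total is about 16/S times the one-cpu node
count.  A tiny sketch of that, with made-up numbers (nothing here is taken from
the article):

  #include <stdio.h>

  /* Extrapolate a parallel node count from the exact one-cpu count,
     assuming every cpu is busy for the whole wall-clock time (as Bob
     says Cray Blitz roughly was on 16 cpus).  All values are made up. */
  int main(void) {
      double nodes_1cpu = 100e6;  /* exact count from the one-cpu run   */
      int    ncpus      = 16;
      double speedup    = 8.0;    /* measured from the raw times        */

      double per_cpu  = nodes_1cpu / speedup;  /* each cpu: 1/8 of them */
      double total    = per_cpu * ncpus;       /* 16 * 1/8 = 2x         */
      double overhead = total / nodes_1cpu;    /* the search overhead   */

      printf("extrapolated total nodes: %.0f (%.1fx the one-cpu count)\n",
             total, overhead);
      return 0;
  }

With a speedup of 16 this comes out to exactly the one-cpu count, and with a
speedup of 8 to twice it, which matches what Bob describes as the search
overhead.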


