Computer Chess Club Archives


Subject: Re: Deep Shredder (2 processors) x Rybka ?

Author: Robert Hyatt

Date: 09:07:50 01/15/06


On January 14, 2006 at 17:04:12, George Sobala wrote:

>On January 14, 2006 at 00:21:33, Robert Hyatt wrote:
>
>>I'm not sure (a) what you mean by "my experience with deep sjeng";
>
>Oops. I thought I was replying to GCP. Silly me.
>
>
>>and (b) what
>>the rest has to do with my comment.  So what if one position is far faster, and
>>another is slower?  If I average going a ply deeper, I'm going to play stronger.
>> You can see the _same_ effect with one processor.  Double the speed and
>>sometimes you get another ply, sometimes nothing because the search explodes.
>>But over the long-run, if a dual-processor program runs 1.7X faster, it will be
>>about equal to a single CPU that is 1.7X faster than the original box as well.
>>
>>Even if a program were to be so erratic that the SMP search goes one ply deeper
>>three moves, and one ply shallower one move, I'd still rather have that program
>>because the average depth is deeper, and it will play stronger overall.  The
>>"noise" in the non-deterministic search doesn't change the overall result.
>>
>>So _overall_ (speaking only for my program) two processors are about 1.7X
>>faster, and the program is stronger as a result.  About as strong as if you
>>somehow overclocked the original processor 70% to get it to 1.7X faster...
>
>I am sure you are correct that in terms of ply-depth, "over the long-run, if a
>dual-processor program runs 1.7X faster, it will be about equal to a single CPU
>that is 1.7X faster than the original box as well." You have the researched data
>to back this statement up.
>
>But it won't be the SAME.



If by "same" you mean "identical" then no, it won't be the same.  Running the
_same_ position N times with a parallel search will at times produce strange
results.

Some positions are 100% repeatable.  Same time taken each time. Same score.
Same PV.  Same everything.

Others are rarely repeatable.  Different time taken almost every time, and it
can vary wildly as well, 30 seconds one time, 10 minutes the next.  Different
best moves.  Different scores.  Different PVs.

Most are somewhere in the middle, closer to repeatable than not, although the
time might vary 10% or more.

But then you can do this same thing by just running the same position N times
using one processor, and vary nothing but hash size.  Check out Fine #70 as a
good test case.  It is highly dependent on hash table data, and a parallel
search is overwriting hash table information in different ways each time you run
it, due to the variability of timing between the two (or more)
threads/processes.
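
A toy sketch of that overwriting effect, not taken from any real engine: two "threads" store entries into a tiny always-replace table, and only the interleaving of the stores changes between the two runs. The table size, the keys, the scores, and the `run` helper are all invented for illustration.

```python
# Toy model: a tiny always-replace hash table shared by two "threads".
# Everything here (sizes, keys, scores, schedules) is made up.

TABLE_SIZE = 4

def run(schedule, stores_a, stores_b):
    """Apply stores in the order given by schedule ('A' or 'B' per step)."""
    table = [None] * TABLE_SIZE
    pending = {"A": iter(stores_a), "B": iter(stores_b)}
    for who in schedule:
        key, score = next(pending[who])
        table[key % TABLE_SIZE] = (key, score)  # always-replace policy
    return table

# (key, score) pairs; keys are chosen so both threads collide on slots 1-3.
stores_a = [(1, 10), (6, 20), (3, 30)]
stores_b = [(5, -10), (2, -20), (7, -30)]

# Same stores, two different timings between the threads.
first = run("AABBAB", stores_a, stores_b)
second = run("BABABA", stores_a, stores_b)

print(first)
print(second)
```

The two runs end with different table contents even though every store is identical, which is the kind of difference that a later probe in a real search would then turn into a different score, PV, or time.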



>
>The "noise" (chaos) in the non-deterministic SMP search makes the chess
>performance of the SMP program rather erratic compared to its fast
>single-processor equivalent. Now maybe Crafty's SMP chaos is much less than Deep
>Shredder's, but certainly for what I have seen, DS running on a 4x2.5GHz quad on
>some occasions may find moves that a single processor Shredder at 40GHz would
>fail to find. Conversely, sometimes it will fail to find the move that a 8-10GHz
>single processor would find. Now your argument is that these two effects cancel
>each other out in the long run, and in any case the "new" move found may
>actually be worse rather than better. True. But what we don't know is whether
>they cancel out *exactly*, or whether the SMP program is slightly advantaged or
>disadvantaged relative to its fast single-processor ply-depth equivalent. It
>would take an awful lot of games to demonstrate e.g. a 20-30 Elo advantage or
>disadvantage: do you have such data?

There is an easy way to measure this, by heads-up play of a large number of
games.  I have done this with great frequency in the past, because it helps to
find bugs in the parallel search if they are there.  My draw-score bug from a
couple of years back was an example.  I'd play the 4cpu version against a
single-cpu version using hardware about 3x as fast as one of the 4 cpu
processors, for a pretty equal performance level.  I was looking for cases where
the four cpu version saw draws that the one cpu version did not, or vice-versa.
Match results were pretty equal once the draw score bug was found.  Before the
fix, the single-cpu program had a slight edge, because the SMP program was
overlooking draws in games it should have won, letting the single-cpu program
escape with a draw.  The single-cpu program won the games it should have won,
since it didn't stumble into a draw by error.  Once that was fixed, things
seemed to be pretty equal, as expected.
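
For a rough feel of what "a large number of games" means, here is a back-of-the-envelope sketch. It uses the standard logistic Elo formula and a plain normal approximation; the function names, the 95% z-value, and the draw-rate parameter are assumptions for illustration, not something measured from these matches.

```python
import math

def expected_score(elo_diff):
    """Expected score (0..1) for the side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_score(score):
    """Inverse of expected_score; score must be strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def games_needed(elo_diff, draw_rate=0.0, z=1.96):
    """Games needed before the expected score sits z standard errors from 50%."""
    p = expected_score(elo_diff)
    wins = p - draw_rate / 2.0
    losses = 1.0 - wins - draw_rate
    if wins < 0.0 or losses < 0.0:
        raise ValueError("draw_rate is too high for this Elo difference")
    # Per-game variance of a result scored 1 / 0.5 / 0.
    variance = wins + 0.25 * draw_rate - p * p
    return math.ceil((z * math.sqrt(variance) / (p - 0.5)) ** 2)

n_no_draws = games_needed(25)
n_with_draws = games_needed(25, draw_rate=0.3)
print(n_no_draws, n_with_draws)
```

Under these assumptions a 25 Elo gap needs several hundred games to show up, and draws lower the count somewhat because they reduce the per-game variance.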

This would not be hard to test without resorting to SMP.  Play a match between
two non-SMP programs, with one having hardware 1.7x faster than the other.  Then
repeat, but now randomly give the faster hardware program more time for a move,
then less time on a different move, to simulate the "variability" and see what
happens.  This has to be done with ponder=off, obviously, since the times are
not matched.  It would seem to me that if I really see 1.7x faster search on
average, I am going to play better.  Even if it means that 3 moves are faster,
one is slower.  For the variability to hurt, the "slow move" has to happen at a
point where it causes a mistake, while the faster searches have to happen at
points where they can't spot some tactical shot a slower search (opponent) would
miss.  That seems random enough that it ought to average out and the faster
program overall should prevail.
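
One way the time variation could be generated, as a sketch only: each move independently gets either "less" or "more" time, with the two values chosen so that the long-run average time is unchanged and the 1.7x hardware advantage is preserved on average. The function names, the 0.4 "slow" scale, and the one-in-four probability are assumptions picked to mirror the "3 moves faster, one slower" picture.

```python
import random

def noisy_time_scales(n_moves, less=0.4, p_less=0.25, seed=0):
    """Per-move time scale factors whose expected value is exactly 1.0."""
    # Solve p_less * less + (1 - p_less) * more == 1.0 for `more`.
    more = (1.0 - p_less * less) / (1.0 - p_less)
    rng = random.Random(seed)  # fixed seed so a match can be replayed
    scales = [less if rng.random() < p_less else more for _ in range(n_moves)]
    return scales, more

def move_times(base_seconds, scales):
    """Time budget per move for the faster-hardware program."""
    return [base_seconds * s for s in scales]

scales, more = noisy_time_scales(60)
times = move_times(180.0, scales)
print(more, sum(scales) / len(scales))
```

The control match uses a scale of 1.0 on every move; comparing its result with the noisy match shows whether the variability by itself costs anything.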

>
>My guess is that the supralinear speedups that sometimes occur may give the SMP
>a slight edge overall. But that is just speculation.

I don't think so myself, because they are uncommon unless the program actually
has bugs.  Super-linear speedup happens, but then again "zero speedup" happens
as well, and they ought to offset since both come at unpredictable times.




Last modified: Thu, 15 Apr 21 08:11:13 -0700
