Author: Robert Hyatt
Date: 09:07:50 01/15/06
On January 14, 2006 at 17:04:12, George Sobala wrote:

>On January 14, 2006 at 00:21:33, Robert Hyatt wrote:
>
>>I'm not sure (a) what you mean by "my experience with deep sjeng";
>
>Oops. I thought I was replying to GCP. Silly me.
>
>
>>and (b) what
>>the rest has to do with my comment. So what if one position is far faster, and
>>another is slower? If I average going a ply deeper, I'm going to play stronger.
>> You can see the _same_ effect with one processor. Double the speed and
>>sometimes you get another ply, sometimes nothing because the search explodes.
>>But over the long-run, if a dual-processor program runs 1.7X faster, it will be
>>about equal to a single CPU that is 1.7X faster than the original box as well.
>>
>>Even if a program were to be so erratic that the SMP search goes one ply deeper
>>three moves, and one ply shallower one move, I'd still rather have that program
>>because the average depth is deeper, and it will play stronger overall. The
>>"noise" in the non-deterministic search doesn't change the overall result.
>>
>>So _overall_ (speaking only for my program) two processors is about 1.7X faster
>>overall. And it is stronger overall. About as strong as if you somehow
>>overclocked the original processor 70% to get it to 1.7X faster...
>
>I am sure you are correct that in terms of ply-depth, "over the long-run, if a
>dual-processor program runs 1.7X faster, it will be about equal to a single CPU
>that is 1.7X faster than the original box as well." You have the researched data
>to back this statement up.
>
>But it won't be the SAME.

If by "same" you mean "identical" then no, it won't be the same. Running the _same_ position N times with a parallel search will at times produce strange results. Some positions are 100% repeatable. Same time taken each time. Same score. Same PV. Same everything. Others are rarely repeatable. Different time taken almost every time, and it can vary wildly as well, 30 seconds one time, 10 minutes the next.
Different best moves. Different scores. Different PVs. Most are somewhere in the middle, closer to repeatable than not, although the time might vary 10% or more.

But then you can do this same thing by just running the same position N times using one processor, and vary nothing but hash size. Check out Fine #70 as a good test case. It is highly dependent on hash table data, and a parallel search is overwriting hash table information in different ways each time you run it, due to the variability of timing between the two (or more) threads/processes.

>
>The "noise" (chaos) in the non-deterministic SMP search makes the chess
>performance of the SMP program rather erratic compared to its fast
>single-processor equivalent. Now maybe Crafty's SMP chaos is much less than Deep
>Shredder's, but certainly for what I have seen, DS running on a 4x2.5GHz quad on
>some occasions may find moves that a single processor Shredder at 40GHz would
>fail to find. Conversely, sometimes it will fail to find the move that a 8-10GHz
>single processor would find. Now your argument is that these two effects cancel
>each other out in the long run, and in any case the "new" move found may
>actually be worse rather than better. True. But what we don't know is whether
>they cancel out *exactly*, or whether the SMP program is slightly advantaged or
>disadvantaged relative to its fast single-processor ply-depth equivalent. It
>would take an awful lot of games to demonstrate e.g. a 20-30 ELO advantage or
>disadvantage: do you have such data?

There is an easy way to measure this, by heads-up play of a large number of games. I have done this with great frequency in the past, because it helps to find bugs in the parallel search if they are there. My draw-score bug from a couple of years back was an example. I'd play the 4-cpu version against a single-cpu version using hardware about 3x as fast as one of the four processors, for a pretty equal performance level.
I was looking for cases where the four-cpu version saw draws that the one-cpu version did not, or vice-versa. Match results were pretty equal once the draw score bug was found. Initially the single-cpu program had a slight edge over the other, but then the SMP program was overlooking draws when it should have won, and let the single-cpu program escape with a draw. The single-cpu program won the games it should have won since it didn't stumble into a draw by error. Once that was fixed, things seemed to be pretty equal as expected.

This would not be hard to test without resorting to SMP. Play a match between two non-SMP programs, with one having hardware 1.7x faster than the other. Then repeat, but now randomly give the faster-hardware program more time for a move, then less time on a different move, to simulate the "variability" and see what happens. This has to be done with ponder=off since times are not matched, obviously.

It would seem to me that if I really see 1.7x faster search on average, I am going to play better, even if it means that 3 moves are faster and one is slower. For the variability to hurt, the "slow move" has to happen at a point where it causes a mistake, while the faster searches have to happen at points where there is no tactical shot to spot that a slower search (the opponent) would miss. That seems random enough that it ought to average out, and the faster program overall should prevail.

>
>My guess is that the supralinear speedups that sometimes occur may give the SMP
>a slight edge overall. But that is just speculation.

I don't think so myself, because they are uncommon unless the program actually has bugs. Super-linear speedup happens, but then again, "zero-speedup" will happen as well, and they ought to offset since they come at unpredictable times.
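
The averaging argument above (some moves searched deeper, an occasional move shallower) can be illustrated with a toy simulation. This is only a sketch and is not the match test described in the post: it assumes each extra ply costs a fixed factor `ebf` more search effort (3.0 here is an arbitrary assumption, not a measured Crafty figure), and `depth_gain` and `simulate` are hypothetical names made up for this example. It compares a steady 1.7x speedup with a per-move speedup that jitters around 1.7x and sometimes falls below 1.0x.

```python
import math
import random


def depth_gain(speedup, ebf=3.0):
    """Extra plies reached for a given speedup, under a toy model in which
    each additional ply costs `ebf` times more effort (assumed value)."""
    return math.log(speedup) / math.log(ebf)


def simulate(n_moves=100_000, mean_speedup=1.7, spread=0.0, seed=1):
    """Average depth gain over n_moves when the per-move speedup is drawn
    uniformly from mean_speedup*(1-spread) to mean_speedup*(1+spread)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_moves):
        speedup = mean_speedup * rng.uniform(1.0 - spread, 1.0 + spread)
        total += depth_gain(speedup)
    return total / n_moves


# Steady 1.7x on every move versus 1.7x on average with +/-50% jitter,
# so individual moves range from 0.85x (slower than baseline) to 2.55x.
steady = simulate(spread=0.0)
noisy = simulate(spread=0.5)

print(f"steady 1.7x: {steady:+.3f} ply on average")
print(f"noisy  1.7x: {noisy:+.3f} ply on average")
```

In this toy model the noisy case comes out somewhat below the steady one, because depth grows with the logarithm of speed, but it stays clearly above zero. The model only counts average depth; whether a shallow search happens to land on a critical move is exactly what the match test would have to measure.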
Last modified: Thu, 15 Apr 21 08:11:13 -0700