Author: Vincent Diepeveen
Date: 06:04:43 09/06/02
On September 05, 2002 at 11:52:20, Robert Hyatt wrote:

I don't care for ICC. I hope you realize that if the tree shape for 2 processors is the same and the overlap of the trees is better, the speedup is better. That's very trivial. It is the reason why searching deeper tends to give a better speedup for DTS: the hash entries then directly give you a similar shape. Or do you deny this?

>On September 05, 2002 at 10:20:53, Vincent Diepeveen wrote:
>
>>On September 04, 2002 at 18:38:17, Dann Corbit wrote:
>>
>>My posting was with regard to the DTS article.
>>
>>The crafty matter is something different. I say crafty
>>is not prepared for the future there, because it relies too
>>much upon 4 things for its speedup SMP
>> - memory/communication speed
>
>So?
>
>> - random splitting instead of chosen like in DTS
>
>I chose to accept that shortcoming to keep the recursive search. I
>_might_ rewrite the search to a non-recursive form one day, but it is
>not a high priority at the moment.
>
>
>> - asymmetric king safety (hard penalties for the own king side
>> are very good for the speedup), which is good for speedup and
>> bad for play, not used by majority of programs in program-program.
>> The existence of this feature is the ultimate proof on how bad
>> Crafty's king safety is.
>
>Now you are off into never-never land. Asymmetric eval helps speedups?
>What a crock. You should re-phrase that. _everything_ I do helps the
>speedup and makes the results invalid, so that your speedups won't look
>so pitiful? My test setup helps speedups. Pondering helps speedups.
>Asymmetry helps speedups. Slow processors help speedups. You flap so
>fast you might actually fly one day. As far as "the ultimate proof" I
>challenge you to play me any day, any time, on ICC and _demonstrate_ how
>bad my king safety is. Of course, as I have mentioned, with you having
>somewhat better hardware you are still losing 2 of every 3 games.
>So that "terrible king safety" in my program suggests that if mine sucks, yours sucks
>with two straws...
>
>> - that crafty would for having n processors a speedup of
>> 1 + 0.7(n-1)
>
>
>That was a formula I derived several years ago right after getting the
>current parallel algorithm up and running. I posted the data for my quad
>pentium pro, and discussed the 30% loss for each additional processor due
>to extra nodes searched...

So you here mention the limit in advance is already for 4 processors at most? Instead of the suggestion which was created in ICCA that it was true for n being big numbers too in the article about crafty.

>I think it is still pretty accurate. But even if it were to be only
>speedup = 1 + .6 * (NCPUS - 1) that is linear and acceptable... That fits
>GCP's test run on the DTS positions. To fit mine it needs to be more like
>speedup = 1 + .68 * (NCPUS -1) as I got 3.0 on the results I sent you and he.

So proof for 1-4 processors is also proof for 32 processors you state here? Despite having a central smp_lock which locks for *everything* when splitting or abortfailhighs ( stopthread() in crafty ). So all processors hammering into smp_lock while copying 3 KB of data which is a hell of a lot of clocks when compared to how little clocks 1 node costs, and it is random splitting so you *keep* splitting near the leafs initially. Despite all that you claim a constant formula for n processors without limiting it to 4?

>Notice that constant multiplier is changing? It will _always_ change if you
>try to fit it to a specific set of positions. There is no "absolute speedup
>formula"...

>>Bob doesn't show a 1.7 speedup at all. He shows 1 position where
>>every algorithm as historic has been proven (bk22) gets a good speedup.

>Vincent, why don't you search the CCC archives. Several years ago I
>posted this data, which was based _not_ on just one position, but on
>the entire kopec 24 position test set.

>I have posted _other_ results.
>I gave you the log files for a 4 cpu
>run against the Cray Blitz DTS positions that produced a 3.0 speedup.
>Why you would keep making statements that are outright (and provably
>so) lies is simply beyond me...

You claimed it was 3.1 here a few days ago. Now it's 3.0, then it's 3.1. I can't agree with any of your experiments anyway, because with the exception of crafty, there is not a single proof regarding Cray Blitz. Do you still have the source code of Cray Blitz, and if not, did you have it at the start of this year?

>I posted a position where crafty delivered a perfect speedup every
>time. To martin last night. I posted another position where the speedup
>was all over the place.

I never posted officially in the ICCA journal yet, but the fraud you commit in the articles there I for sure won't do.

>You _never_ post any reasonable speedup data. You just wave your hands,
>and "proof" that diep is the best there is in parallel search. What about
>some data from you? What about some data from someone running your program
>that is not you, since I am not sure we can trust your data. Anybody can
>verify mine, since the program is public...
>
>You say I can't get a NPS improvement of 1.9x with 2 cpus. Yet I did,
>and several duals got real close, all being above 1.8. Far better than
>your claimed 1.4 or 1.0 for me... you said "huge slowdown". 2.0 is optimal.
>1.9 is a "huge slowdown".
>
>Does the word "ridiculous" mean anything in the context of statements you
>make all the time???
>
>
>>
>>The last point is very serious.
>>
>>GCP ran 30 positions at bob's quad xeon and had an average speedup of 2.8
>>at it with crafty.
>>
>>That is not near 3.1 which is claimed according to the formula,
>>and it is measured very accurately over many positions.
>
>And did I not run the same 24 positions on my quad 700 and get 3.0? Did
>I not give you the raw data log? So his 24 positions at 2.8 is "very accurate"
>while my 3.0 over the _same_ 24 positions is not?
>
>That's the kind of scientific reasoning I like...
>
>
>>
>>Bob also received the outputs of it. and still has them.
>
>
>I sure do. They are just one more data point. If I run it enough times
>I'm pretty sure I will get numbers over 3.0 also. But whether the speedup
>for that set of positions averages at 2.8 or 3.5 doesn't matter. That doesn't
>mean the speedup over a larger set of positions won't be different, as I have
>said _many_ times.
>
>You say I can't produce 1.7 on two. Should we pick a neutral person with a
>good dual (Eugene comes to mind) and let him run the test? I doubt he would
>because then he would be subject to your "fraud" nonsense if he produces numbers
>that don't agree with "your reality".
>
>>
>>>My take on the matter (in one paragraph):
>>>Robert wrote a paper on parallel speedup, showing a 1.7 increase for 2 CPU's (as
>>>derived from his more general formula). Vincent was unable to reproduce this
>>>sort of speedup, and thought the research was faulty. Robert agreed that the
>>>test set was limited and you won't always get that sort of speedup, but as an
>>>average (over a broad set of positions) that's about what he got. There has
>>>been some acrimony over whether superlinear speedups are possible. I think that
>>>the jury is still out on that one.
>>>
>>>At any rate, that's my take on the whole thing.
>>>
>>>Vincent always sees things in pure, jet black or gleaming, powder white. If
>>>something isn't terrific, then it is pure junk. While I think his mode of
>>>interesting is a bit odd, it's one of the things that make Vincent interesting.
>>>
>>>Robert has always been a man of strong convictions, and if you call him a
>>>'noo-noo head' he'll call you one back. He isn't one to back down when he
>>>thinks he is right. That's one of the things I like about Dr. Hyatt.
>>>
>>>When these two styles happen to ram into one another, the sparks are sure.
>>>A philosophical question is often asked:
>>>"What happens when an immovable object meets an irresistible force?"
>>>
>>>The 'debate' is an answer to that question.
>>>;-)
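
For readers following the numbers in this thread, here is a minimal sketch of the linear speedup model being argued over, speedup = 1 + k * (NCPUS - 1). The function names `linear_speedup` and `fitted_constant` are made up for illustration and are not taken from Crafty or any other program. The sketch only evaluates the formula for the constants quoted above (0.7, 0.68, 0.6) and solves it backwards for k from a single measured speedup.

```python
def linear_speedup(ncpus: int, k: float) -> float:
    """Linear model quoted in the thread: speedup = 1 + k * (ncpus - 1)."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    return 1.0 + k * (ncpus - 1)


def fitted_constant(ncpus: int, measured_speedup: float) -> float:
    """Solve the same model for k, given one measured speedup (hypothetical helper)."""
    if ncpus < 2:
        raise ValueError("need at least 2 CPUs to fit k")
    return (measured_speedup - 1.0) / (ncpus - 1)


if __name__ == "__main__":
    # Constants mentioned in the posts above.
    for k in (0.7, 0.68, 0.6):
        row = ", ".join(f"{n} cpus: {linear_speedup(n, k):.2f}" for n in (2, 4))
        print(f"k={k}: {row}")
    # Constant implied by a single 4-CPU measurement.
    print(f"k fitted to 2.8 on 4 cpus: {fitted_constant(4, 2.8):.2f}")
```

With k = 0.7 the model gives 1.7 on 2 CPUs and 3.1 on 4 CPUs, and with k = 0.6 it gives 2.8 on 4 CPUs, which are the figures quoted in the thread. Evaluating the formula says nothing about whether it still holds at 32 processors; that extrapolation is exactly what is disputed above.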