Author: Robert Hyatt
Date: 21:17:26 09/21/01
On September 21, 2001 at 20:53:37, Vincent Diepeveen wrote:

>On September 18, 2001 at 22:11:19, Robert Hyatt wrote:
>
>>On September 18, 2001 at 20:36:17, Vincent Diepeveen wrote:
>>
>>>
>>>You're not reading what i write and your example sucks, as the
>>>effect only happens at say 15 minutes a move, and running 2 processes
>>>at 1 cpu is too much overhead, only after 1 hour i get a speedup,
>>>which i never get in a game for each move.
>>
>>
>>OK... wait for 3 years for hardware 10x faster. then you can go more than
>>1x faster with one cpu. Which is going to be a _real_ trick to prove to
>>the rest of the computer science world.
>>
>>
>>>
>>>Of course i tested this bigtime!
>>>
>>>If your read a bit better what i mentionned then it's clear
>>>that no programs move ordering is very good in contradiction to some
>>>crap statements a few years ago in RGCC when not many except some
>>>commercial guys had measured their cutoff rates and their flip rates.
>>>
>>>>have a bug, or an anomaly position. It is simply not possible. If you
>>>>are _really_ doing that, you should modify your single-cpu version as
>>>>follows:
>>>
>>>Basically you also say here that Fritz is having a bug, because fritz
>>>has the same like DIEP here, only the difference is that the overhead
>>>from fritz is that big that it takes like half a day before a parallel
>>>search is giving more than 2.0 speedup.
>>
>>
>>I don't believe it for one second. I haven't even seen Frans claim a
>>speedup of 2.00 yet. Last I heard he said "I don't get the same speedup
>>as crafty does..."
>
>There is some hard evidence here. Note that fritz DOES produce the
>number of nodes it needs for each ply at the screen, in contradiction
>with crafty.
>
>Perhaps your parallel overhead is simply that huge that you hardly see
>that sometimes you need less nodes, and i bet you never compare single
>cpu node outputs with dual node outputs.

Two things.

1. My parallel search has essentially zero overhead.
The time it takes me to "split" is so near zero as to be called zero.

2. In everything I have ever written, from my PhD dissertation through the
JICCA paper, I have _always_ compared the single-processor node counts to
the multiple-cpu node counts. That is _exactly_ where I get my speedup
formula of S = 1 + (N-1) * .7. The lost .3 of every processor is search
overhead.

>
>Of course these effects are hell harder on 4 cpu's as i explained, but
>it seems you're not listening to that!
>
>>
>>
>>>
>>>Ah DANG, more programs have it.
>>>
>>>I bet you didn't know that!!
>
>>Nor did you, I'll bet.
>
>I was pretty amazed when i also saw Deepfritz having this indeed.
>The overhead from parallel search is however that big, that it takes
>a long time to see it!
>
>>>
>>>Everyone who has spent lots of times
>>>into parallellizing a program in a proper way is having this effect.
>>
>>
>>
>>Vincent, that is simply _false_. Want to bet I know more "everyones" than
>>you do? lets start. Me. Hsu. Newborn. Schaeffer. Waycool group at caltech.
>
>Your info is usually 20 years old. Like the Cray Blitz info.
>
>You're one of the pioneers of computer chess, especially parallel
>computer chess, let's not forget that, but this is 2001 and others
>are being creative now!
>
>Cray Blitz from 15 years ago would be blown away with a single
>P3-800Mhz nowadays, for several reasons
> a) progress in algoritms (R=3)
> b) the machine is hell slower than nowadays single cpus even are
> c) book
> d) better testing
> e) better tuning
> f) more knowledgeable facts are tested into nowadays software
> g) less dubious forward pruning by most commercials
>
>And the list goes on. You had excellent hardware for those days, but
>compared with programs from these days and it gets extrapolated to today.

You can repeat that crap every week for the next 30 years. But that will
_not_ make it true. R=3 takes about 2 minutes to add to Cray Blitz. The
rest is _already_ there...
>
>Also a major impact for parallel behaviour is that you used R=1 with Cray
>blitz in a very limited way, versus nowadays R=3 is getting used and
>near tips by some R=2 is getting used (i'm again back on R=3 there too).
>
>>Warnock/Wendroff. Spracklens. Moreland. Daily. Several amateurs using SMP.
>>I have heard _nobody_ claim to go more than 2x faster on 2 cpus. It is a
>>simply preposterous idea. For the reason I gave. You could use two threads
>>on one
>
>Ok let's start discussing them. The first 2 guys i never heart of.
>Spracklens only programmed at small CPUs AFAIK and they didn't search
>very deep *ever*.

Warnock/Wendroff: a program called LaChex, from Los Alamos Lab. It
participated in several ACM events with a good parallel search.

>
>Moreland works at a 4 processor machine using recursion and doing
>all kind of dubious stuff the last so many plies. Also using singular
>extensions and all other kind of extensions. So for sure already turning
>on these extensions means chance he gets > 2x speedup is near zero.
>

His speedup is _definitely_ > 2.0... just ask him (assuming you mean
4 processors). If you mean 2 processors, then no sane person claims to
get > 2 speedup.

>Only if he kicks out all kind of dubious extensions and runs at 2 cpu's
>he'll get near.
>
>Daily is using the much praised cilk language to parallellize. SO HE
>DOESN'T CONTROL THE PARALLEL SPEEDUP AT ALL!
>
>So daily can shake forever getting a 2x speedup.
>
>More persons to discuss?
>
>Aha, Rudolf Huber with SOS, Patzer SMP, Shredder.
>
>Well i'm not sure for Shredder, but for sure the first 2 guys are
>using a completely different parallel strategy where *only* profits
>can be made by means of hashtable somehow.
>
>So getting 2.0 with that is very unlikely.
>
>Which programs do i miss?
>
>There are simply too little parallel programs Bob, this is the key problem!
>
>There is crafty...
>...and crafty...
>...and crafty...
>...o yes and DIEP
>
>And the rest hardly posts anything about their parallel implementation
>and speedups anyway.
>
>So 50% of the programs where speedup results get published from is
>getting a > 2.0 speedup!
>
>Note that some positions also Crafty goes
>faster than 2x at 2 cpus. Did you forget that?

Nope. In one of every 20 cases, maybe. That is _expected_ since move
ordering can never be perfect.

(more on next post, netscape is fixing to crash)...
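
For readers who want to plug numbers into the speedup formula quoted in this post, S = 1 + (N-1) * .7, here is a minimal Python sketch. The function names and the sample node counts are hypothetical and do not come from Crafty or Cray Blitz. The overhead helper assumes "search overhead" means the fraction of extra nodes the parallel search visits compared with the single-cpu search on the same position, which may not match the exact definition used in the dissertation.

```python
def predicted_speedup(n_cpus: int, per_cpu_gain: float = 0.7) -> float:
    """Speedup formula from the post: S = 1 + (N - 1) * 0.7."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return 1.0 + (n_cpus - 1) * per_cpu_gain


def search_overhead(nodes_single: int, nodes_parallel: int) -> float:
    """Assumed definition: extra nodes searched in parallel, as a fraction
    of the single-cpu node count for the same position and depth."""
    if nodes_single <= 0:
        raise ValueError("nodes_single must be positive")
    return nodes_parallel / nodes_single - 1.0


if __name__ == "__main__":
    # Predicted speedup for a few processor counts.
    for n in (1, 2, 4, 8):
        print(f"{n} cpus: predicted speedup {predicted_speedup(n):.1f}")

    # Hypothetical node counts, purely for illustration.
    print(f"overhead: {search_overhead(1_000_000, 1_300_000):.2f}")
```

With the formula as quoted, 2 cpus give a predicted 1.7 and 4 cpus give 3.1, so a measured speedup above 2.0 on 2 cpus sits well outside what the formula predicts for the typical position.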
Last modified: Thu, 15 Apr 21 08:11:13 -0700