Author: Uri Blass
Date: 21:48:21 09/21/01
On September 22, 2001 at 00:17:26, Robert Hyatt wrote:
>On September 21, 2001 at 20:53:37, Vincent Diepeveen wrote:
>
>>On September 18, 2001 at 22:11:19, Robert Hyatt wrote:
>>
>>>On September 18, 2001 at 20:36:17, Vincent Diepeveen wrote:
>>>
>>>>
>>>>You're not reading what i write and your example sucks, as the
>>>>effect only happens at say 15 minutes a move, and running 2 processes
>>>>at 1 cpu is too much overhead, only after 1 hour i get a speedup,
>>>>which i never get in a game for each move.
>>>
>>>
>>>OK... wait for 3 years for hardware 10x faster. then you can go more than
>>>1x faster with one cpu. Which is going to be a _real_ trick to prove to
>>>the rest of the computer science world.
>>>
>>>
>>>>
>>>>Of course i tested this bigtime!
>>>>
>>>>If your read a bit better what i mentionned then it's clear
>>>>that no programs move ordering is very good in contradiction to some
>>>>crap statements a few years ago in RGCC when not many except some
>>>>commercial guys had measured their cutoff rates and their flip rates.
>>>>
>>>>>have a bug, or an anomaly position. It is simply not possible. If you
>>>>>are _really_ doing that, you should modify your single-cpu version as
>>>>>follows:
>>>>
>>>>Basically you also say here that Fritz is having a bug, because fritz
>>>>has the same like DIEP here, only the difference is that the overhead
>>>>from fritz is that big that it takes like half a day before a parallel
>>>>search is giving more than 2.0 speedup.
>>>
>>>
>>>I don't believe it for one second. I haven't even seen Frans claim a
>>>speedup of 2.00 yet. Last I heard he said "I don't get the same speedup
>>>as crafty does..."
>>
>>There is some hard evidence here. Note that fritz DOES produce the
>>number of nodes it needs for each ply at the screen, in contradiction
>>with crafty.
>>
>>Perhaps your parallel overhead is simply that huge that you hardly see
>>that sometimes you need less nodes, and i bet you never compare single
>>cpu node outputs with dual node outputs.
>
>two things.
>
>1. My parallel search has essentially zero overhead. The time it takes
>me to "split" is so near zero as to be called zero.
>
>2. In everything I have ever written, from my PhD dissertation, through
>the JICCA paper _always_ compared the single-processor node counts to the
>multiple-cpu node counts. That is _exactly_ where I get my speedup formula
>of S = 1 + (N-1) * .7
>
>that lost .3 of every processor is search overhead.
>
>
>
>>
>>Of course these effects are hell harder on 4 cpu's as i explained, but
>>it seems you're not listening to that!
>>
>>>
>>>
>>>>
>>>>Ah DANG, more programs have it.
>>>>
>>>>I bet you didn't know that!!
>>
>>>Nor did you, I'll bet.
>>
>>I was pretty amazed when i also saw Deepfritz having this indeed.
>>The overhead from parallel search is however that big, that it takes
>>a long time to see it!
>>
>>>>
>>>>Everyone who has spent lots of times
>>>>into parallellizing a program in a proper way is having this effect.
>>>
>>>
>>>
>>>Vincent, that is simply _false_. Want to bet I know more "everyones" than
>>>you do? lets start. Me. Hsu. Newborn. Schaeffer. Waycool group at caltech.
>>
>>Your info is usually 20 years old. Like the Cray Blitz info.
>>
>>You're one of the pioneers of computer chess, especially parallel
>>computer chess, let's not forget that, but this is 2001 and others
>>are being creative now!
>>
>>Cray Blitz from 15 years ago would be blown away with a single
>>P3-800Mhz nowadays, for several reasons
>> a) progress in algoritms (R=3)
>> b) the machine is hell slower than nowadays single cpus even are
>> c) book
>> d) better testing
>> e) better tuning
>> f) more knowledgeable facts are tested into nowadays software
>> g) less dubious forward pruning by most commercials
>>
>>And the list goes on.
>>You had excellent hardware for those days, but
>>compared with programs from these days and it gets extrapolated to today.
>
>
>You can repeat that crap every week for the next 30 years. But that will
>_not_ make it true. R=3 takes about 2 minutes to add to Cray Blitz. The
>rest is _already_ there...

I am not expressing an opinion about the speed improvement that parallel search can give today, but I do not see that the rest is already there.

1) The fact that you need only 2 minutes to implement R=3 in Cray Blitz means nothing if you did not use R=3 when you measured the speed improvement from parallel search.

2) Vincent is right that the machine is far slower than today's single CPUs, because he is talking about the Cray Blitz of 15 years ago and not about the latest Cray Blitz, which can search 7M nodes per second.

I understand that his claim is about long time controls. The only way to test whether he is right is to take a lot of positions from games in which the machine changes its mind after a long search, and to compare the time it needs to do so without parallel search against the time it needs with parallel search, with the hash tables the same in both cases.

I have positions from my correspondence games in which Deep Fritz changed its mind after a long search, but I do not have a multiprocessor machine. If people are interested in testing this, I can send them my positions so that they can measure the speed improvement Deep Fritz gets from parallel search.

I agree that a speedup of more than 2 from 2 processors means that it is possible to improve the program when it is using 1 processor.

Uri
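
For anyone who wants to run the comparison described above, here is a rough sketch in Python of how the numbers could be tabulated. It only uses the formula S = 1 + (N-1) * .7 quoted earlier in the thread and the ratio of the two times to the change of mind. The function names and the timing values are hypothetical and made up for illustration; they are not measurements from Deep Fritz or any other program.

```python
from statistics import geometric_mean


def predicted_speedup(n_cpus, efficiency=0.7):
    """Speedup predicted by the quoted formula S = 1 + (N - 1) * 0.7."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return 1.0 + (n_cpus - 1) * efficiency


def measured_speedup(t_single, t_parallel):
    """Ratio of single-cpu time to parallel time for the same change of mind."""
    return t_single / t_parallel


def summarize(times, n_cpus):
    """Summarize (t_single, t_parallel) pairs, one pair per test position."""
    speedups = [measured_speedup(t1, tn) for t1, tn in times]
    return {
        "speedups": speedups,
        "mean": geometric_mean(speedups),  # ratios are averaged geometrically
        "predicted": predicted_speedup(n_cpus),
        # positions where N cpus were more than N times faster
        "superlinear": sum(1 for s in speedups if s > n_cpus),
    }


if __name__ == "__main__":
    # Hypothetical times in seconds until the move changes: (1 cpu, 2 cpus).
    example_times = [(3600.0, 1500.0), (1200.0, 800.0), (5400.0, 3000.0)]
    print(summarize(example_times, n_cpus=2))
```

Any position counted under "superlinear" is a case where 2 processors were more than 2 times faster, which is exactly the situation that suggests the 1 processor search could be improved.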
Last modified: Thu, 15 Apr 21 08:11:13 -0700