Author: Miguel A. Ballicora
Date: 08:34:35 09/25/01
Go up one level in this thread
On September 25, 2001 at 10:38:33, Tony Werten wrote: >On September 25, 2001 at 10:29:56, Robert Hyatt wrote: > >>On September 25, 2001 at 03:55:23, Tony Werten wrote: >> >>>On September 24, 2001 at 22:49:02, Robert Hyatt wrote: >>> >>>>On September 24, 2001 at 21:49:52, Vincent Diepeveen wrote: >>>> >>>>> >>>>>So you say kind of here that it is NEVER possible in your viewpoint to >>>>>get a 2.0 speedup when only measuring search times? >>>>> >>>>>Please give a clear statement here. >>>>> >>>>>Yes or No. >>>>> >>>>> >>>> >>>> >>>>OK. For an average case, which is made up of multiple positions, the answer is >>>>"no" unconditionally. >>>> >>>>For a single test case here and there, the answer is _obviously_ "yes". >>>> >>>>_everybody_ has reported a >2 speedup for 2 cpus on a random position or two >>>>out of a set. This is expected since we are dealing with randomness in the >>>>move ordering and random can sometimes be good, sometimes bad. >>>> >>>>Look at my DTS numbers. at least one > 2, several well below 2. >>>> >>>>Average is a scientific-looking 1.9... >>>> >>>>If you report >2.0 speedups consistently, you are simply going to get >>>>laughed at over and over. It is just not possible unless your serial >>>>search is bad. So either your results (parallel search) are bad, or >>>>your serial search is bad. In either case, "bad is bad". >>> >>>I'll go for a 3rd one. The measurement are not related. >>> >>>Making an algorithm parallel changes the algorithm and can therefor not be >>>compared to the previous algorithm. >> >> >>I don't see this as a problem. The issue still remains, "can you do something >>parallel that executes more than 2x faster than what you do serially?" If the >>answer is "yes" then there is something basically wrong with your serial >>algorithm... >> >> >> >> >>> >>>FE The second thread can make use of entries in the hashtable, which ( if it was >>>singlethreaded ) could have been replaced by different entries. >> >>This is non-deterministic behavior that is well-known in parallel search. >>But it is a random element that won't happen most of the time. It will >>pop up on occasion, just like programs solve fine 70 quicker than they >>should due to inferior move ordering. >> >>But over a wide test set, this random improvement becomes "noise". >> >> >> >> >>> >>>Of course, the different way around works as wel. The second thread cannot use >>>entries that haven't been placed in the hashtable yet, but this is called >>>overhead. (The first possibility should be called "underhead". :) >>> >>>The occurence of this makes the whole measurement quite random, altough on >>>average it seems normal. The only way to measure the exact speedup of a >>>multiprocessor is to compare it with the same number of threads on a single >>>processor, and this can never be more than the number of cpu's. >>> >>>If you find a way to increase the locality of the hashtable for different >>>threads you should be able to improve more than the no processors. >>> >>>Maybe Vincent is doing this with his elaborated move ordering, or maybe it's his >>>multiple probe stuff. It's easy to check. He should let the dual version run on >>>a single processor, if it is faster than the single version, he does have his >>>more than linear speedup. >>> >>>Note: Multiple threads on a single CPU is still a parallel search, so if the >>>speedup is there he did not improve his serial search but his parallel search. >> >>No... multiple threads on a single CPU is _not_ parallel. Nothing is being >>executed in parallel since there is one instruction stream per cpu. That isn't >>the same thing at all. > >I understand, but the idea is that the search is parallel (although it is >executed serially in a multitasking environment ) wich should give very related >results to a real parallel search. Which makes the maximum theoretical speedup going from a 1x processor machine to installing a second one (2x) of 2.0 + A being A = overhead of using one processor compared to two when a multithreading program is used. I guess that A is veryy small but certainly not cero, but almost. So, if any program has a speed up of, say, 2.2 compared to the single threaded search, it would be better to run the SMP program in a single processor rather than the single processor version. Regards, Miguel > >Tony > >> >> >> >>> >>>Anyone, please jump in where I go wrong. >>> >>>cheers, >>> >>>Tony
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.