Computer Chess Club Archives



Subject: Re: New crap statement ?

Author: Tony Werten

Date: 11:46:02 09/25/01



On September 25, 2001 at 11:34:35, Miguel A. Ballicora wrote:

>On September 25, 2001 at 10:38:33, Tony Werten wrote:
>
>>On September 25, 2001 at 10:29:56, Robert Hyatt wrote:
>>
>>>On September 25, 2001 at 03:55:23, Tony Werten wrote:
>>>
>>>>On September 24, 2001 at 22:49:02, Robert Hyatt wrote:
>>>>
>>>>>On September 24, 2001 at 21:49:52, Vincent Diepeveen wrote:
>>>>>
>>>>>>
>>>>>>So you say kind of here that it is NEVER possible in your viewpoint to
>>>>>>get a 2.0 speedup when only measuring search times?
>>>>>>
>>>>>>Please give a clear statement here.
>>>>>>
>>>>>>Yes or No.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>OK.  For an average case, which is made up of multiple positions, the answer is
>>>>>"no" unconditionally.
>>>>>
>>>>>For a single test case here and there, the answer is _obviously_ "yes".
>>>>>
>>>>>_everybody_ has reported a >2 speedup for 2 CPUs on a random position or two
>>>>>out of a set.  This is expected since we are dealing with randomness in the
>>>>>move ordering, and randomness can sometimes be good, sometimes bad.
>>>>>
>>>>>Look at my DTS numbers.  At least one is > 2, several are well below 2.
>>>>>
>>>>>Average is a scientific-looking 1.9...
>>>>>
>>>>>If you report >2.0 speedups consistently, you are simply going to get
>>>>>laughed at over and over.  It is just not possible unless your serial
>>>>>search is bad.  So either your results (parallel search) are bad, or
>>>>>your serial search is bad.  In either case, "bad is bad".
>>>>
>>>>I'll go for a 3rd one: the measurements are not related.
>>>>
>>>>Making an algorithm parallel changes the algorithm and can therefore not be
>>>>compared to the previous algorithm.
>>>
>>>
>>>I don't see this as a problem.  The issue still remains, "can you do something
>>>parallel that executes more than 2x faster than what you do serially?"  If the
>>>answer is "yes" then there is something basically wrong with your serial
>>>algorithm...
>>>
>>>
>>>
>>>
>>>>
>>>>For example, the second thread can make use of entries in the hashtable which
>>>>(if the search were single-threaded) could have been replaced by different entries.
>>>
>>>This is non-deterministic behavior that is well-known in parallel search.
>>>But it is a random element that won't happen most of the time.  It will
>>>pop up on occasion, just like programs solve Fine 70 quicker than they
>>>should due to inferior move ordering.
>>>
>>>But over a wide test set, this random improvement becomes "noise".
>>>
>>>
>>>
>>>
>>>>
>>>>Of course, it works the other way around as well: the second thread cannot use
>>>>entries that haven't been placed in the hashtable yet, but this is called
>>>>overhead. (The first possibility should be called "underhead". :)
>>>>
>>>>The occurrence of this makes the whole measurement quite random, although on
>>>>average it seems normal. The only way to measure the exact speedup of a
>>>>multiprocessor is to compare it with the same number of threads on a single
>>>>processor, and this can never be more than the number of CPUs.
>>>>
>>>>If you find a way to increase the locality of the hashtable for different
>>>>threads, you should be able to improve by more than the number of processors.
>>>>
>>>>Maybe Vincent is doing this with his elaborate move ordering, or maybe it's his
>>>>multiple-probe stuff. It's easy to check: he should let the dual version run on
>>>>a single processor; if it is faster than the single version, he does have his
>>>>more-than-linear speedup.
>>>>
>>>>Note: multiple threads on a single CPU is still a parallel search, so if the
>>>>speedup is there, he did not improve his serial search but his parallel search.
>>>
>>>No... multiple threads on a single CPU is _not_ parallel.  Nothing is being
>>>executed in parallel since there is one instruction stream per cpu.  That isn't
>>>the same thing at all.
>>
>>I understand, but the idea is that the search is parallel (although it is
>>executed serially in a multitasking environment), which should give results
>>very comparable to a real parallel search.
>
>Which makes the maximum theoretical speedup, going from a 1-processor machine
>to installing a second one (2x), equal to 2.0 + A,
>A being the overhead of using one processor compared to two when a multithreaded
>program is used.

- B

B being the "underhead": the profit you get from being able to use entries from
the hashtable that would otherwise have been replaced.

The point is how to get B bigger than A.
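
One way to see which of the two wins (this is just a sketch of my own, not code or
data from any of the programs discussed; the three timings below are made-up example
numbers) is to time the same test set three ways: the serial build on one CPU, the
SMP build with two threads forced onto one CPU, and the SMP build on two CPUs. If the
two-thread run on a single CPU is faster than the serial run, B > A, and the usual
speedup figure can legitimately come out above 2.0.

#include <stdio.h>

/* Hypothetical wall-clock times (seconds) for the same test set.
   The numbers are invented purely for illustration. */
int main(void)
{
    double t_serial    = 100.0;  /* single-threaded build, 1 CPU       */
    double t_2thr_1cpu =  95.0;  /* 2-thread build forced onto 1 CPU   */
    double t_2thr_2cpu =  47.5;  /* 2-thread build on 2 CPUs           */

    /* Net effect of overhead A versus "underhead" B, as a time ratio:
       a value below 1.0 means B outweighed A.                         */
    double net = t_2thr_1cpu / t_serial;

    /* The speedup people usually report: serial time / 2-CPU time.    */
    double reported = t_serial / t_2thr_2cpu;

    /* The speedup the hardware actually delivered: same 2-thread
       program, 1 CPU versus 2 CPUs; this cannot exceed roughly 2.0.   */
    double hardware = t_2thr_1cpu / t_2thr_2cpu;

    printf("net A-vs-B ratio : %.3f\n", net);      /* 0.950, so B > A  */
    printf("reported speedup : %.3f\n", reported); /* 2.105, above 2.0 */
    printf("hardware speedup : %.3f\n", hardware); /* 2.000            */
    return 0;
}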

The only way is to increase the locality of the search, kind of like the way the
2-level hashtable works: the 1st level is for the deep searches, the 2nd level
(always replace) for the local searches.

Maybe a multi-level hashtable would work; maybe multiple probing is actually a
multi-level hashtable.
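
For what it's worth, here is a rough sketch of the 2-level idea (my own illustration,
not code from any of the programs mentioned; the struct layout, table sizes and names
are invented): a depth-preferred table keeps the expensive deep entries, while an
always-replace table catches everything else, so nearby probes still find something
recent.

#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t key;    /* Zobrist hash of the position        */
    int16_t  score;
    int8_t   depth;  /* remaining search depth of the entry */
    uint8_t  flags;  /* bound type: exact / lower / upper   */
} HashEntry;

#define TABLE_SIZE (1 << 20)                /* entries per level, power of two */

static HashEntry deep_table[TABLE_SIZE];    /* level 1: depth-preferred */
static HashEntry always_table[TABLE_SIZE];  /* level 2: always replace  */

/* Store: the deep table only gives up its slot to an entry searched at
   least as deeply; the always-replace table takes every new result.   */
void hash_store(uint64_t key, int16_t score, int8_t depth, uint8_t flags)
{
    size_t i = (size_t)(key & (TABLE_SIZE - 1));
    HashEntry e = { key, score, depth, flags };

    if (deep_table[i].key == 0 || depth >= deep_table[i].depth)
        deep_table[i] = e;
    always_table[i] = e;
}

/* Probe: prefer the deep entry, fall back to the always-replace one. */
const HashEntry *hash_probe(uint64_t key)
{
    size_t i = (size_t)(key & (TABLE_SIZE - 1));

    if (deep_table[i].key == key)   return &deep_table[i];
    if (always_table[i].key == key) return &always_table[i];
    return NULL;                    /* miss */
}

Whether sharing such a table between two threads actually raises locality the way I
hope is a separate question; the sketch only shows the replacement policy.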

Just some thoughts.


>I guess that A is very small, certainly not zero, but almost.

For 2 processes the overhead is small, so B doesn't have to be very big either.

>
>So, if any program has a speedup of, say, 2.2 compared to the single-threaded
>search, it would be better to run the SMP program on a single processor rather
>than the single-processor version.

That would be the consequence, yes. The idea would be to first do all the related
nodes, before their entries are replaced by less related ones.
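
To put a number on it (my own arithmetic, taking the 2.2 above as an assumed
measurement): two CPUs can contribute at most a factor of about 2, so a 2.2 speedup
over the serial build means the SMP build already runs about 2.2 / 2.0 = 1.1 times
as fast as the serial build on a single processor, i.e. the net gain of B over A is
worth roughly 10% in speed.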

Tony

>
>Regards,
>Miguel
>
>
>>
>>Tony
>>
>>>
>>>
>>>
>>>>
>>>>Anyone, please jump in where I go wrong.
>>>>
>>>>cheers,
>>>>
>>>>Tony


