Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: New crap statement ? Perpetuum mobile

Author: Robert Hyatt

Date: 14:37:10 10/02/01

Go up one level in this thread


On October 02, 2001 at 17:11:46, Sune Fischer wrote:

>On October 02, 2001 at 13:50:29, Robert Hyatt wrote:
>
>>On October 02, 2001 at 12:16:48, Sune Fischer wrote:
>>
>>>Hi
>>>
>>>Just curious, does anyone know of a parallel algorithm which is faster than the
>>>serial one, for any type of problem?
>>
>>Absolutely not.  Except for "exceptional" circumstances.  IE take a chess
>>position with two moves at the root.  One of the moves leads to a very deep
>>and complex tree.  The other leads to a simple mate in 2.  If you search the
>>first one first, and then the second, you do a lot of work.  If you search both
>>in parallel, the second will finish almost instantly, give you a much better
>>value for alpha, and cause the first search to finish more quickly.  Net result
>>is a speedup that is > 2.
>
>Obviously it will be faster if you have move ordering reversed as you have here,
>but this case is not really interesting as it only happens a few percent of the
>time.

I agree.  That is why I said this happens on the occasional problem position,
but it is more than offset by the positions that produce a < 2.0 speedup.



>
>>But there is a way to do that same thing on one
>>processor.  Search a few nodes on move 1, then a few nodes on move 2, and
>>bounce back and forth.  You quickly find (again) that move 2 leads to a quick
>>mate and use this to prune the other move.  And that super-linear speedup goes
>>away.
>
>I think you misunderstood my question a little bit, when I say "serial
>algorithm" i refer to something that can be carried out on a single thread, it
>has nothing to do with what the algorithm does, it maybe complex as h*ll -
>doesn't matter, if it runs on 1 thread, its a serial algorithm.
>To state that a parallel algorithm is faster than a serial means, that the
>algorithm can be written to run faster using multiple threads than to use one
>thread, even if it's only running on 1 CPU.
>I have never of a case like that, and I don't believe it exists.

I'm not sure I follow.  I can write something to use multiple threads.  Or,
if I only use one cpu, I can also write the same code to "interlace" things
just as multiple threads would do, without the multiple threads.  The code
would be messy.  But it would simulate a parallel search on one cpu with no
threads required.  This is how I used to test Cray Blitz many years ago on our
vax, which (at the time) had nothing like the thread facility on the Cray and
I didn't want to try to find some funky VMS way to do threads...  since i would
never run on a vax in a real tournament.

>
>
>>So far, in all the parallel search literature I have read over the years, there
>>has been absolutely no case of a super-linear speedup on anything other than
>>an exceptional case here and there.  And when there are such super-linear
>>speedups, there are _always_ the inverse cases as well, where the 2-cpu speedup
>>is significantly less than 2.  I have _never_ seen an average speedup that is
>>super-linear, ever.
>
>I agree fully, linear speedup and you are happy.
>
>>>I know of many that can be done equally fast (in theory) at 2x half the power,
>>>for example vector operations. Fortran 90 (and above) has some great support for
>>>that.
>>
>>That isn't quite the same thing.  The issue is using two "equal processors" to
>>run > twice as fast.  A vector processor is not the same.  It isn't just 2x as
>>fast, for example.  It is a totally different architectural approach to
>>computation.
>
>I have shared memory systems in mind here, dual boards and such.
>Not sure what you mean exactly. A vector operation can go 2x as fast if you
>don't have bandwith problems.

It can go 10x faster as it does on the cray.  But that is not the same sort
of comparison.  The Cray does 2 vector operations per cycle (a sort of
super-scalar vector operation).  But it can also create a vector pipeline that
will produce two results every clock cycle, for as long a vector as you want,
while the scalar processor takes 3 clocks to add two 64 bit ints.  You can
stream in scalar instructions of course, but only if they are independent...




>
>>>I have never ever heard of a parallel algorithm that was faster than the serial
>>>one (on 1 CPU of cause).
>>>It seems the communication and coordination between threads plus the increased
>>>complexity will always cost you performance.
>>
>>For some applications, that are often called "embarassingly parallel"
>>(ray-tracing is one simple example that comes to mind) 2x is easy.  4x is
>>easy on 4 processors.  But not >2x with 2, nor > 4x with 4.  In ray tracing
>>there is almost no communication/coordination.  that is why is is called
>>"embarassingly parallel".
>
>Been there, done that ;)
>Still, these are just more examples of linear speedup, lets see if anyone can
>find that super linear one.
>
>-S.


There will be claims, of course. :)



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.