Computer Chess Club Archives


Subject: Re: New crap statement ? Perpetuum mobile

Author: Robert Hyatt

Date: 15:27:56 10/02/01


On October 02, 2001 at 16:49:03, Bruce Moreland wrote:

>On October 02, 2001 at 13:50:29, Robert Hyatt wrote:
>
>>Absolutely not.  Except for "exceptional" circumstances.  IE take a chess
>>position with two moves at the root.  One of the moves leads to a very deep
>>and complex tree.  The other leads to a simple mate in 2.  If you search the
>>first one first, and then the second, you do a lot of work.  If you search both
>>in parallel, the second will finish almost instantly, give you a much better
>>value for alpha, and cause the first search to finish more quickly.  Net result
>>is a speedup that is > 2.  But there is a way to do that same thing on one
>>processor.  Search a few nodes on move 1, then a few nodes on move 2, and
>>bounce back and forth.  You quickly find (again) that move 2 leads to a quick
>>mate and use this to prune the other move.  And that super-linear speedup goes
>>away.
>
>"Exceptional circumstances" is good enough.  You declare this possibility to be
>against the laws of physics, and then you find a counter-example yourself, but
>rather than admitting that this destroys your argument, you put a box around it
>and say that this doesn't count -- just because.

No I didn't.  A computer processor can execute N instructions per second.  Two
can execute no more than 2N instructions per second.  The board example is a
classic parallel task that one person struggles with: working alone, that one
person does more work than the two people do combined.

In alpha/beta this is _not_ the case.  The one-person task is the most efficient
way to search the tree.  Alpha/beta is founded on sequential ordering, so trying
to discuss whether it is possible to search a tree more than twice as fast with
two processors doesn't make any sense: the one-processor search is _provably_
doing less work than the two-processor search.  (My Ph.D. dissertation has the
math to explain this in detail, as did a paper in the Journal of Parallel
Processing around 1986-1987.)

>
>Imagine a game where mates in one are found with very high frequency throughout
>the (binary) tree.
>
>It makes sense to write your sequential algorithm so that it checks for them
>before searching more deeply, assuming they are easy to check for.
>
>If you don't bother to do this, the parallel algorithm could be faster, because
>in essence it makes this optimization.

I agree.  But that is _the_ point: a crappy serial algorithm vs a parallel
algorithm that happens to look better only because the serial algorithm is bad.
But alpha/beta is not a bad serial algorithm.  It is simply the most efficient
way of searching a tree that we currently have, and it will most likely remain
so forever.




>
>It is certainly true that you can make a sequential algorithm that solves this
>problem, but if you hadn't thought to do this, the parallel algorithm could be
>faster.

OK...  and I don't agree there.  But if that is published, it will be laughed
right out of the reviewer's office.  Because nobody cares about parallelizing
an algorithm that is worthless in the first place.  IE I can parallelize a
bubble-sort and get linear speedup.  But a heap-sort (serial) will be far
faster with one processor.  And the linear speedup suddenly looks ridiculous.
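
To put rough numbers on that: a bubble sort does about N*N/2 comparisons, while
a heap sort does about N*log2(N).  For N = 1,000,000 that is roughly 5 x 10^11
comparisons versus 2 x 10^7.  Even a perfectly linear speedup on 1,000
processors leaves the parallel bubble sort at 5 x 10^8 steps, still more than
an order of magnitude behind the one-processor heap sort.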




>
>The reason the parallel algorithm could be faster is that the parallel algorithm
>is not the same algorithm as the sequential algorithm.  Making it parallel
>changes it significantly, and without knowing enough about the game being
>examined, it can't be predicted which algorithm is faster.
>

There is a great body of knowledge about alpha/beta, starting with Knuth/Moore's
paper, which I used as the foundation of my own paper on the parallel search of
the same trees.  The sequential property of alpha/beta is the bedrock the entire
algorithm is built on.  Violating that sequential requirement (which we do in
our parallel searches every day) incurs overhead that can't be avoided, and
overhead means super-linear speedups are going to be rare rather than the norm.




>There is no crushingly important difference between chess and this other game.
>If you didn't know much about the nature of chess, you might think that chess
>could work this way, too.  That chess doesn't work this way has something to do
>with chess, not with atomic particles and laws of thermodynamics.


Chess isn't the issue.  Alpha/beta is.  I did a huge number of experimental
tests in my Ph.D. dissertation: on worst-first move ordering, on best-first
move ordering, and on the "nearly-best-first" move ordering we see in chess.
But the tree is independent of the game, from the search perspective.  IE my
parallel search works for checkers, Othello, you name it.




>
>This doesn't sound like something that is violating any laws of physics, to me.
>You have two similar games, and one would result in a super-linear speedup if
>you take a certain algorithm and run it in parallel.  The other *might* not.
>Whether it does or not has nothing to do with laws of physics, it has to do with
>the nature of the algorithms you use, and the nature of the game itself.


Our semantic disagreement here is that I am not talking about some unspecified
"certain algorithm."  I am specifically addressing the alpha/beta search.  I
don't know whether best-first will or will not produce a super-linear speedup
under the right circumstances.  But alpha/beta is a known animal with a lot of
theoretical analysis in the books...



>
>If you think the table case is bad, make a plywood box the size of a
>foot-locker, with no handles, and fill it with bricks to the point where you
>can't lift it.  Now push that through dry sand for a mile.  I don't know if
>you've ever pushed a heavy rectangular box through dry sand, but it's hell.
>
>Now get another person, each grab an end, and carry it through the sand for two
>miles.  This won't be any fun but at least you won't have a heart attack.  This
>must be a much better way to do it, however you care to measure.


Maybe, or maybe not.  But you are not comparing the _same_ algorithm either.
If I drag it, that is _one_ algorithm.  If two lift it and carry it, that is
_another_ algorithm.  If the floor is steel, dragging it will be less work.
If it is sand, dragging might (or might not) be more work.  I used to have to
run in sand during high school football, carrying someone on my back, and I am
convinced it would have been easier to drag them, because their weight buries
you in the sand when it is dry.

But back to the main idea.  We are all using alpha/beta.  That fixes the
algorithm and the work that must be done, beyond any scintilla of doubt,
because the formulas are precise and clear about how much work you must do to
search a specific tree using alpha/beta.  Searching the tree in _any_ order
other than best-first introduces more work most of the time, and no change in
the rest of the cases where the order is not important.

Doing that in parallel is not going to introduce some new and unexpected
side-effect that makes the search go faster.  It is going to introduce search
overhead in the bad cases (most of the cases), no overhead in those occasional
good cases, and negative overhead in the few cases where move ordering is so
bad that the parallel out-of-order search actually improves the move ordering
and reduces the total tree size.  But the speedup must be <= 2.0 for the average
case in alpha/beta, or else the search is not using alpha/beta with reasonable
move ordering.
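
To make that overhead concrete: if splitting the work causes the two processors
to search 30% more total nodes than the serial search would have, the best
possible speedup is 2.0 / 1.3, or about 1.54, no matter how perfectly the two
processors are kept busy.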





>
>>So far, in all the parallel search literature I have read over the years, there
>>has been absolutely no case of a super-linear speedup on anything other than
>>an exceptional case here and there.  And when there are such super-linear
>>speedups, there are _always_ the inverse cases as well, where the 2-cpu speedup
>>is significantly less than 2.  I have _never_ seen an average speedup that is
>>super-linear, ever.
>
>It would be possible if the sequential algorithm did a bad job, and the parallel
>algorithm not only resulted in work-sharing, but improved the algorithm in some
>fashion.


Unfortunately Knuth/Moore have _proved_ that there is no algorithm that will
search a minimax game tree with fewer nodes than alpha/beta with perfect move
ordering.  And I mean _no_ algorithm exists that can beat alpha/beta in terms
of nodes searched to get the true minimax score of the game tree...
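
For the record, their result says that with perfect move ordering, alpha/beta
on a uniform tree of width W and depth D examines

    W^ceil(D/2) + W^floor(D/2) - 1

leaf nodes, versus W^D for pure minimax.  For chess-like numbers, say W = 38
and D = 10, that is about 1.6 x 10^8 leaves instead of roughly 6.3 x 10^15,
and no algorithm that computes the true minimax value can examine fewer.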



>
>Here is an example.
>
>You have an array of arbitrary (big) length L.  You want to write an algorithm
>that determines if there is a value in the array that is less than or equal to
>N.
>
>An obvious sequential algorithm traverses the array from 1..L, searching for a
>value less than N.  If it finds one, it stops.

That is an obviously bad algorithm for some cases, however.  But for the
typical case, where the position of the first value <= N is uniformly
distributed over the entire array, the parallel algorithm will not produce a
super-linear speedup on average.  Serially, I am going to search about L/2
entries to find a value <= N.  With two processors, I will cover those L/2
entries in half the time, for a perfect speedup of 2.0 in the average case.
And for odd cases, like the value you want sitting at entry L/2+1, time-slicing
your search finds it on the second comparison: check the first entry, then
L/2+1, then 2, then L/2+2, and so on.
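
Here is a minimal sketch of that time-sliced scan in C (the function name and
interface are mine, purely for illustration):

    #include <stddef.h>

    /* One processor alternates between the two halves of the array, so
       whatever the two-processor version finds quickly, this finds at
       most one comparison later.  Returns the index of the first value
       found that is <= n, or -1 if there is none. */
    ptrdiff_t timesliced_find_le(const int *a, size_t len, int n)
    {
        size_t i = 0, j = len / 2;

        while (i < len / 2 || j < len) {
            if (i < len / 2) {            /* one "tick" on the front half */
                if (a[i] <= n)
                    return (ptrdiff_t) i;
                i++;
            }
            if (j < len) {                /* one "tick" on the back half */
                if (a[j] <= n)
                    return (ptrdiff_t) j;
                j++;
            }
        }
        return -1;
    }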

But on average, this is a classic "good algorithm" with no sequential property
at all, and it should produce a linear speedup easily.  Alpha/beta, however, is
not free of that sequential property.  You can't jump to the second half of the
tree without having finished the first half, or the search space grows.  That
is the problem.
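
To see that sequential property in code form, here is a bare-bones negamax
alpha/beta over a toy tree (the types and names are mine, for illustration
only).  Move k is searched with the alpha produced by moves 1 through k-1;
start move k before its predecessors finish and you search it with a weaker
bound, which is exactly the search overhead a parallel split pays:

    /* A toy game-tree node: leaves carry a static value. */
    typedef struct Node {
        int           nchild;
        struct Node **child;
        int           value;     /* used only when nchild == 0 */
    } Node;

    int alphabeta(const Node *n, int alpha, int beta)
    {
        if (n->nchild == 0)
            return n->value;

        for (int i = 0; i < n->nchild; i++) {
            int score = -alphabeta(n->child[i], -beta, -alpha);
            if (score > alpha)
                alpha = score;   /* earlier moves raise the bound...   */
            if (alpha >= beta)
                break;           /* ...which makes later cutoffs possible */
        }
        return alpha;
    }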



>
>An obvious parallel algorithm, designed to run on two processors, puts one of
>them to work searching 1..L/2, and the other searching backwards from the end
>(L..L/2+1).  If either processor finds a value less than N, both stop.
>
>The parallel algorithm would exhibit a super-linear speedup if the array
>contained null-terminated strings, the array-length was the length of the string
>+ 1, and N was 1.  The second processor would always cut off on the first
>element it looked at, so with a sufficiently large L, this algorithm would
>destroy the sequential algorithm no matter how much parallel overhead was
>involved.

With that data, time-slicing would solve the problem, assuming the value is
not uniformly distributed.  Otherwise the parallel version will only
occasionally "destroy" the single-cpu case.  But without any sequential
property, this still doesn't compare to alpha/beta.



>
>This would be a dumb way to process this data, but the issue here is not to find
>a SMART algorithm that returns a super-linear speedup, but rather to find ANY
>algorithm that does it, since you are being so "laws of physics" and "all the
>parallel search literature" about this.

If you integrate the tests over all possible arrays, you won't find that
super-linear speedup surviving as part of the average.  The average case will
be T/2...  some cases will be far faster, some will be no faster.  IE what if
the value is the last one in the first half of the list?  You get a speedup of
1.0, or nothing.  When you average all of those together, you get at most 2.0,
which was my point.  An occasional super-linear speedup will happen.  But not
over a large set of experimental data.

For your list, try 6 numbers and put the one you want at each of the 6
possible positions, say with one processor scanning entries #1..#3 and the
other scanning entries #4..#6.  Assume each comparison takes one second for
simplicity.

if the value you want is in

                   serial time   parallel time    speedup
#1, you need            1              1             1.0
#2, you need            2              2             1.0
#3, you need            3              3             1.0
#4, you need            4              1             4.0
#5, you need            5              2             2.5
#6, you need            6              3             2.0

average is 1.92, which is <= 2.0, as I said.

Two of the 6 cases produced super-linear speedups, one produced the perfect
2.0, and the other three were no faster than the serial algorithm...
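
If you want to check the table, a few lines of C will enumerate all six cases
(this assumes, as the table's parallel times imply, that the second processor
scans forward from entry #4):

    #include <stdio.h>

    int main(void)
    {
        const int L = 6;
        double total = 0.0;

        for (int target = 0; target < L; target++) {
            int serial   = target + 1;                /* scan #1..#6   */
            int parallel = (target < L / 2)
                               ? target + 1           /* cpu 1: #1..#3 */
                               : target - L / 2 + 1;  /* cpu 2: #4..#6 */
            double speedup = (double) serial / parallel;
            printf("#%d: serial %d  parallel %d  speedup %.2f\n",
                   target + 1, serial, parallel, speedup);
            total += speedup;
        }
        printf("average speedup: %.2f\n", total / L);   /* prints 1.92 */
        return 0;
    }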


>
>And even though it's dumb way to do this, you might not have known it was a dumb
>way when you wrote it.  You might not have known what kind of data you were
>going to be dealing with beforehand.


See above.  The example really doesn't work...  I've done parallel programming
long enough that I _know_ that this stuff won't stand up to careful analysis.


>
>In this case, the average speedup is probably super-linear.  Note that
>sequential algorithms exist, but that doesn't matter.  We're talking about an
>obvious sequential algorithm modified to work in parallel, like what we do with
>alpha-beta.

Check out my example of your algorithm.  The average is not even linear.  Not
quite...


>
>What does this mean?  It means that when some crazy person suggests that they've
>found a super-linear speedup in a chess program, you should at least LISTEN
>before you call them crazy (or stupid).

Ahem.  First find an algorithm that produces a super-linear speedup.  You
haven't done so yet...  I _always_ "listen" or I wouldn't have joined the
discussion.  But I also know that in this case, it is _impossible_.




>
>Is Vincent crazy?  Maybe we should take a poll.  One thing that is sure is that
>the laws of physics don't demand that he must be.
>
>bruce

No, but the laws of "Knuth/Moore" certainly say that his claim is flawed.  Look
at your example: it _seems_ to be super-linear until you test all the cases, and
then it falls apart, as they _all_ do under careful analysis.



