Author: Robert Hyatt
Date: 19:54:44 10/04/01
On October 04, 2001 at 01:54:50, Miguel A. Ballicora wrote:

>On October 03, 2001 at 18:53:31, Dave Gomboc wrote:
>
>>On October 02, 2001 at 16:49:03, Bruce Moreland wrote:
>>
>>>On October 02, 2001 at 13:50:29, Robert Hyatt wrote:
>>>
>>>>Absolutely not, except for "exceptional" circumstances. I.e., take a chess position with two moves at the root. One of the moves leads to a very deep and complex tree. The other leads to a simple mate in 2. If you search the first one first, and then the second, you do a lot of work. If you search both in parallel, the second will finish almost instantly, give you a much better value for alpha, and cause the first search to finish more quickly. The net result is a speedup that is > 2. But there is a way to do that same thing on one processor: search a few nodes on move 1, then a few nodes on move 2, and bounce back and forth. You quickly find (again) that move 2 leads to a quick mate and use this to prune the other move. And that super-linear speedup goes away.
>>>
>>>"Exceptional circumstances" is good enough. You declare this possibility to be against the laws of physics, and then you find a counter-example yourself, but rather than admitting that this destroys your argument, you put a box around it and say that it doesn't count -- just because.
>>>
>>>Imagine a game where mates in one are found with very high frequency throughout the (binary) tree.
>>>
>>>It makes sense to write your sequential algorithm so that it checks for them before searching more deeply, assuming they are easy to check for.
>>>
>>>If you don't bother to do this, the parallel algorithm could be faster, because in essence it makes this optimization.
>>>
>>>It is certainly true that you can make a sequential algorithm that solves this problem, but if you hadn't thought to do this, the parallel algorithm could be faster.
>>>
>>>The reason the parallel algorithm could be faster is that it is not the same algorithm as the sequential one. Making it parallel changes it significantly, and without knowing enough about the game being examined, it can't be predicted which algorithm is faster.
>>>
>>>There is no crushingly important difference between chess and this other game. If you didn't know much about the nature of chess, you might think that chess could work this way, too. That chess doesn't work this way has something to do with chess, not with atomic particles and laws of thermodynamics.
>>>
>>>This doesn't sound to me like something that violates any laws of physics. You have two similar games, and one would result in a super-linear speedup if you take a certain algorithm and run it in parallel. The other *might* not. Whether it does or not has nothing to do with laws of physics; it has to do with the nature of the algorithms you use, and the nature of the game itself.
>>>
>>>If you think the table case is bad, make a plywood box the size of a foot-locker, with no handles, and fill it with bricks to the point where you can't lift it. Now push that through dry sand for a mile. I don't know if you've ever pushed a heavy rectangular box through dry sand, but it's hell.
>>>
>>>Now get another person, each grab an end, and carry it through the sand for two miles. This won't be any fun, but at least you won't have a heart attack. This must be a much better way to do it, however you care to measure it.
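Bruce's mate-in-one game above is easy to make concrete. The toy C sketch below is purely illustrative -- is_mate(), the divisible-by-three rule, and DEPTH are invented here and appear nowhere in the thread -- but it shows the difference he describes: the plain depth-first search wanders far down the left side of the tree before stumbling onto a mate, while the version that peeks at both children before descending (the check a left/right parallel split performs implicitly) finds one immediately at the root.

#include <stdio.h>

#define DEPTH 20          /* depth of the toy binary game tree */

static long nodes;        /* nodes visited, for comparing the two searches */

/* invented stand-in for "this position is a mate in one" */
static int is_mate(unsigned long key) { return key % 3 == 0; }

/* the obvious sequential search: descend first, notice mates on arrival */
static int dfs_plain(unsigned long key, int depth) {
    nodes++;
    if (is_mate(key)) return 1;
    if (depth == 0)   return 0;
    return dfs_plain(key * 2,     depth - 1)
        || dfs_plain(key * 2 + 1, depth - 1);
}

/* the same search, but peek at both children for an immediate mate
   before descending -- the check a two-processor split makes for free */
static int dfs_mate_check(unsigned long key, int depth) {
    nodes++;
    if (is_mate(key)) return 1;
    if (depth == 0)   return 0;
    if (is_mate(key * 2) || is_mate(key * 2 + 1)) return 1;
    return dfs_mate_check(key * 2,     depth - 1)
        || dfs_mate_check(key * 2 + 1, depth - 1);
}

int main(void) {
    nodes = 0;
    int found = dfs_plain(1, DEPTH);
    printf("plain dfs:  found=%d after visiting %ld nodes\n", found, nodes);
    nodes = 0;
    found = dfs_mate_check(1, DEPTH);
    printf("mate check: found=%d after visiting %ld nodes\n", found, nodes);
    return 0;
}

Once the peek is folded back into the sequential search, the parallel split's apparent advantage disappears, which is exactly the point at issue: the two are different algorithms.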
>>>
>>>>So far, in all the parallel search literature I have read over the years, there has been absolutely no case of a super-linear speedup on anything other than an exceptional case here and there. And when there are such super-linear speedups, there are _always_ the inverse cases as well, where the 2-cpu speedup is significantly less than 2. I have _never_ seen an average speedup that is super-linear, ever.
>>>
>>>It would be possible if the sequential algorithm did a bad job, and the parallel algorithm not only resulted in work-sharing, but improved the algorithm in some fashion.
>>>
>>>Here is an example.
>>>
>>>You have an array of arbitrary (big) length L. You want to write an algorithm that determines if there is a value in the array that is less than or equal to N.
>>>
>>>An obvious sequential algorithm traverses the array from 1..L, searching for a value less than or equal to N. If it finds one, it stops.
>>>
>>>An obvious parallel algorithm, designed to run on two processors, puts one of them to work searching 1..L/2, and the other searching backwards from the end (L..L/2+1). If either processor finds a value less than or equal to N, both stop.
>>>
>>>The parallel algorithm would exhibit a super-linear speedup if the array contained null-terminated strings, the array length was the length of the string + 1, and N was 1. The second processor would always cut off on the first element it looked at, so with a sufficiently large L, this algorithm would destroy the sequential algorithm no matter how much parallel overhead was involved.
>>>
>>>This would be a dumb way to process this data, but the issue here is not to find a SMART algorithm that returns a super-linear speedup, but rather to find ANY algorithm that does it, since you are being so "laws of physics" and "all the parallel search literature" about this.
>>>
>>>And even though it's a dumb way to do this, you might not have known it was a dumb way when you wrote it. You might not have known what kind of data you were going to be dealing with beforehand.
>>>
>>>In this case, the average speedup is probably super-linear. Note that better sequential algorithms exist, but that doesn't matter. We're talking about an obvious sequential algorithm modified to work in parallel, like what we do with alpha-beta.
>>>
>>>What does this mean? It means that when some crazy person suggests that they've found a super-linear speedup in a chess program, you should at least LISTEN before you call them crazy (or stupid).
>>>
>>>Is Vincent crazy? Maybe we should take a poll. One thing that is sure is that the laws of physics don't demand that he must be.
>>>
>>>bruce
>>
>>Observing that the parallel algorithm was super-linearly quicker immediately gives rise to a superior sequential algorithm: alternate checking from the front of the array and the back of the array for the zero. It would quickly be observed that the parallel algorithm does not have a super-linear speedup over this superior sequential algorithm. In addition, the sequential algorithm may be further optimized by ignoring the front of the array altogether, which would reduce the parallel speedup further.
>>
>>To ignore the superior sequential algorithm and instead publish results comparing only the original, poor sequential algorithm with the parallel algorithm is not merely disingenuous but simply bad science.
>>
>>Dave
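Both halves of that exchange fit in a few lines of C. This is only a sketch -- the toy array, n = 1 standing in for the null-terminator case, and the function names are all invented here -- but it puts Bruce's obvious forward scan next to Dave's alternating scan, which folds the two-processor visiting order into a single thread.

#include <stdio.h>

/* Bruce's "obvious" sequential algorithm: scan front to back for a
   value <= n; return its index, or -1 if there is none. */
static long scan_forward(const int *a, long len, int n) {
    for (long i = 0; i < len; i++)
        if (a[i] <= n)
            return i;
    return -1;
}

/* Dave's superior sequential algorithm: alternate probes between the
   front and the back of the array.  (His further optimization --
   ignore the front entirely -- degenerates to probing a[len-1] first.) */
static long scan_alternating(const int *a, long len, int n) {
    long lo = 0, hi = len - 1;
    while (lo <= hi) {
        if (a[lo] <= n)
            return lo;
        if (hi != lo && a[hi] <= n)
            return hi;
        lo++, hi--;
    }
    return -1;
}

int main(void) {
    /* a "null-terminated string" style array: the hit sits at the end */
    int a[] = { 7, 9, 8, 6, 5, 9, 7, 0 };
    long len = (long)(sizeof a / sizeof a[0]);

    printf("forward scan:     hit at index %ld\n", scan_forward(a, len, 1));
    printf("alternating scan: hit at index %ld\n", scan_alternating(a, len, 1));
    return 0;
}

On the string-style data Bruce describes, the forward scan probes every element before reaching the terminator, while the alternating scan hits it on its second probe -- no second processor required for the speedup.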
>
>In a more general case, rewriting a parallel algorithm as a sequential one might require heavy use of instructions that are not convenient (for instance, lots of mispredicted branches, or whatever you can imagine in future CPUs), creating some overhead -- perhaps small, but measurable -- that makes the use of two threads a tiny bit more efficient. Of course, using threads has its own overhead, but the balance could hypothetically go either way.
>
>Regards,
>Miguel

I ignore this for multiple reasons:

1. SMP machines have a _definite_ interaction between the CPUs: blocked/deferred access to memory, memory conflicts, a shared bus that is overloaded, etc. There is a _definite_ loss of performance with the second CPU, since a single CPU has enough memory problems on its own.

2. Threads don't have to have _any_ cost. I programmed a Xerox Sigma 9 for years. It had two sets of registers, so I could flip back and forth between two threads, using the two register sets, with zero overhead -- or at least closer to zero than the overhead of a dual-processor machine.

3. Algorithms are independent of this anyway, because it is _always_ possible to find an algorithm that won't work well on a particular machine, and then compare that to a machine with twice the resources that will run the algorithm well. That isn't what this algorithm analysis is about.

It is _very_ safe to assume that the dual-CPU machine and the single-CPU-running-two-threads cases are _identical_ in their overhead costs from the machine's perspective, which means the overhead can be ignored...
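The one-processor version of the two-root-moves story from the top of the thread -- search a few nodes on move 1, a few on move 2, bounce back and forth -- can be sketched the same way. Nothing below comes from Crafty or from the post itself; RootMove, search_slice(), the SLICE quantum, the node counts, and the mate-score threshold are all invented stand-ins.

#include <stdio.h>
#include <limits.h>

#define SLICE 1000   /* nodes searched before switching to the other move */

/* hypothetical per-root-move search state */
typedef struct {
    long nodes_left;   /* stand-in for the size of the move's subtree  */
    int  score;        /* score reported once the subtree is exhausted */
    int  done;
} RootMove;

/* pretend to search `budget` nodes of one move; 1 once the move is finished */
static int search_slice(RootMove *m, long budget) {
    if (!m->done) {
        m->nodes_left -= budget;
        if (m->nodes_left <= 0)
            m->done = 1;
    }
    return m->done;
}

int main(void) {
    /* move 0: the deep, complex subtree; move 1: the simple mate in 2 */
    RootMove mv[2] = { { 50000000L, 10, 0 }, { 500L, 32000, 0 } };
    int alpha = -INT_MAX;
    long slices = 0;

    while (!mv[0].done || !mv[1].done) {
        for (int i = 0; i < 2; i++) {
            if (mv[i].done)
                continue;
            slices++;
            if (search_slice(&mv[i], SLICE) && mv[i].score > alpha) {
                alpha = mv[i].score;          /* a better bound...        */
                if (alpha >= 30000) {         /* ...in fact a mate score: */
                    printf("mate found after %ld slices\n", slices);
                    return 0;                 /* prune the other move     */
                }
            }
        }
    }
    printf("searched everything; best score %d\n", alpha);
    return 0;
}

The huge first move is abandoned after two time slices because the second move's mate score raises alpha past the cutoff -- the same pruning the two-CPU split buys, with no second CPU.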