Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Would This Configuration Work?

Author: Robert Henry Durrett

Date: 08:16:32 05/31/02

Go up one level in this thread


On May 31, 2002 at 01:19:57, David Dory wrote:

>>You have some very interesting ideas in your post.  But before getting to them,
>>I want to make sure we are "singing on the same sheet of music," so to speak.
>>My interest in this topic developed while thinking about the upcoming >DF/Kramnik match in October. I tried to imagine a configuration which would >have a reasonable chance of being built in time for that match.  Since no one >has posted anything here about the computer that is to be used for that match, >I found myself speculating as to what could possibly be planned.  The first
>>thought that came to me was:  "Whatever it is, it's going to have to be
>>something they can build and deliver by August, if Kramnik is to have an
>>opportunity to play with it for a few weeks before the match."  This led to
>>recall of the often quoted principle "Keep it Simple Stupid [KISS]."  To me,
>>that meant that anything that would require extensive design, development, and
>>testing would be absolutely out of the question.  This prompted the question
>>"What could be put together in just a few months?"  Although non-optimal, the
>>configuration I asked about seemed to me to be one possible choice for that
>>match.
>=============
>This configuration is already available. It's called a "cluster". It has a
>number of independent m/board's fitted into a common bus (backplane). Each mobo
>works independently of the other, and has it's own memory, etc.

That brings back memories.  I have been monitoring this bulletin board off and
on for a number of years now.  I vaguely recall bulletins discussing clusters.
Also saw clusters discussed in technical journals before I retired.  [Don't look
at technical journals now since no motivation to do so.]

At the risk of sounding like an irritating broken record, the key issue in my
mind is the speed of the microprocessors.  [Also speed of the boards.]  Although
the cluster configuration concept has been around for years, the clusters built
and evaluated always are limited by the availability of suitable components.  A
cluster built two years ago necessarily had to use some comparatively slow
hardware, for example.  It does no good to use a 100% efficient design if the
components have to be terribly slow!

>
>The computer for the Kramnik match is (IIRC), already chosen. It's from Compaq.

Forgive me, but I forgot what IIRC means.  I used to have a list of acronyms
commonly used on bulletin boards but seem to have misplaced it.  [It's probably
on my desk somewhere, buried under a pile of papers!]

>=============
>>
>>The first thought was to have separate chess computers operating almost
>>independently.  This would mean almost no communication between the individual
>>computers during most of the time between the opponent's last move and the
>>computer's next move.  Similarly, for "thinking on the opponent's time," there
>>would be "almost no communication between the individual computers during most
>>of the time between the computer's last move and the opponent's next move."
>>
>>This obviously would imply a performance hit.  But to compensate for that
>>inefficiency, it would be possible to use VERY fast processors.  The added >speed of the eight individual computers was imagined to more than compensate >for the inefficiency, in view of the schedule constraint.  Currently available
>>8-processor computers are not very fast.  I doubt that any new processors would
>>be built just for this match.  The cost would be prohibitive.
>>_ _ _ _ _ _ _ _ _ _ _ _ _ _
>==========
>Yes. No new CPU will be made just for a chess match! Intel is a fan of chess,
>(they've sponsored a tournament or two), but not crazy!

Are you sure?  Some of these companies give the likes of Kasparov and Kramnik
millions of dollars just to get a little publicity!  Crazy!

>==========
>>
>>>About a month or so back, a poster gave out a url for a "cluster" which I
>>>loosely define as an arrangement of CPU's with their own memory space - ie. no
>>>shared memory.
>>>
>>>Just for fun, think of a single processor (regular) computer, and into each of
>>>the PCI slots (could be any kind of slots, however) on the motherboard, we'd
>>>have a single card computer (CPU, memory, etc.) which would communicate directly
>>>over the PCI bus with the master CPU.
>>>
>>>It's no problem having the CPU's not repeating each other's work in the >>search, but the cost of that is a tremendous amount of communication with the >>master CPU.
>>
>>Whenever I see words like "tremendous amount," I immediately ask "How much?"
>>Before your total concept is rejected, it seems necessary to quantify this and
>>then, and only then, reach any conclusions as to whether or not this is a "show
>>stopper."
>
>That's exactly * IT *, as I said before. I know of ABSOLUTELY NO TESTS of this
>concept. If you understand the bandwidth problems, (and believe me, I understand
>them about as LITTLE as they can be), you have to say that THEORETICALLY it
>shouldn't work very well (ie, 1 or 2 ply extra in the search).

Don't be afraid to talk about information bandwidth and Shannon's theorems if
you want to.  There are quite a few people here who can discuss them
intelligently.  [I've seen the bulletins.]  Back in the 70s, when I was working
on my MS degree in Electrical Engineering, my major was "Communication and
Information Theory."  We studied all that good stuff then, but all the examples
and problems were in the context of communication systems.

Back then, "Computer Science" was not much more than a wet dream yet. [Hope the
monitors didn't see that!] Nowadays, the computer science graduate students
routinely and blissfully study information theory in the context of design of
computer software systems and think there is nothing special about it.  I have
not been exposed to information theory in that context but many here have.

Go ahead. Dive in! Get wet!!  [Oops!  I did it again!  Sorry.]

>
>>Well, your guess is as good as mine.  But you should still want to
>>quantitatively evaluate this before rejecting your idea.
>
>I agree 100%.
>
>>Generally, I agree that a "smart" partitioning strategy would entail some
>>up-front analysis in preparation for assignment of the different possible
>>two-ply sequences to the eight "computers" or processors.
>>
>>It seems that there are many possible ideas for "smart" partitioning.  One idea
>>might be as follows:  Based on a very quick preliminary assessment [performed
>>for a dozen seconds or so after the last move was made] a list of all possible
>>two-ply move sequences, listed in priority order, would be produced.  The top
>>priority move could then be assigned to the first fast processor, the next
>>highest to the next processor, etc., cycling back to the first processor for the
>>ninth highest priority move, and continuing this process until all possible
>>two-ply move sequences had been allocated.  Then the eight processors could
>>begin their analyses in earnest.
>==========
>Yes, but it wouldn't take a competent program more than two seconds to produce
>this two ply move list.

I guess we need an input here from someone more familiar with this.  I don't
have a good feel for how long it would take. Perhaps finding the most attractive
eight or sixteen two-ply sequences would be sufficient.  That would save a
little time.  Also, it may not be necessary to have a "high fidelity"
evaluation.  A rough estimate might be sufficient.  There are tradeoffs here.
>==========
>>
>>>Such a search (because remember, we don't KNOW in advance, what moves are good
>>>ones, otherwise we'd only choose those, hell with the search :-) ), needs a ton
>>>of communication to divide up the work amongst all the processors as the game
>>>tree takes "shape" during the search.
>>
>>About communication:  In your configuration, this communication would be at
>>about 400 MB/sec at the most.  It is not clear that "a ton" of communication is
>>needed, however.  It seems to depend on whether or not you intend communication
>>between the processors as they are working.  If not, maybe not so much
>>communication is needed.
>================
>I'm not sure if PCI bus will support 400 MB/sec or not. Sounds high, maybe??

True.  I picked 400 MHz as an "upper bound."

>
>Chess programs ALL rely on lots of communication between the functions. Trying
>to parallelize the work w/o shared memory would be something entirely new (at
>least to me).

Another "broken record":  The only motivation for selecting a non-optimal design
was to enable use of very fast processors.  I suspected that the speed of the
processors was more important than the efficiency of the design.  If fast
processors could be used, no one in his/her right mind would select anything
except the most efficient design.

>================
>
>
>>In the configuration which I had imagined, each processor would be in it's own
>>box, and each box would be rack mounted in a standard 19" rack.  The individual
>>computers would be connected with a network.  The question this immediately
>>raises is:  How fast could such a network be?
>
>Firewire is the fastest comm technology I know of. Second would be the type 2
>USB's, (which are many times faster than the type 1 USB's).
>
>Since you have the hardware bible, what's the throughput for the PCI bus?

In the same table 20.1, Rosch gives 120 to 240 mbps for USB-2.0.

>
>Next question is what's the throughput for a passive backplane?

Maybe that's in Rosch's new book, but I haven't seen it there yet.  Just got the
book and haven't read it all. [1348 pages of text!!!!!!!]

>
>How do these all compare?
>
>I'm sure the networking could ultimately connect far more computers, but I
>believe the bus, Firewire or USB2's would be faster.

What I was considering was a "network" [or "bus"] which would be completely
enclosed by a standard 19" rack.  The small size would facilitate speed.  A much
larger network/bus might be a real mess, especially if messages had to be routed
through routers [and much more] as on the Internet.  Probably totally
impractical if speed is the objective.

I am not up to speed on the state of the art on ultra-fast busses and networks.
The information on FireWire caught my eye but don't know how well that [and
similar] technology has been implemented in real-world networks &/or busses.

>
>
>>It was more useful to me to think about the speed increase.  If a single
>>computer were to take eight minutes to find a certain hard-to-find move, then
>>maybe the "divide and conquer" strategy would find that same move in a minute >or two.
>
>>Maybe a better measure would be rating points increase over just using one
>>processor.  Although inefficient, when compared to the case where the eight
>>processors were in continual intelligent communication at all times, there >still should be a rating points increase.  It might not take much of an >increase in rating points to make the difference between Kramnik winning [as >he would most likely do against a single processor Fritz] and the 8-processor >Fritz chess computer winning.
>>
>============
>Maybe, maybe!!

All "computer-chess geeks" surely are rooting for the computer.  Right?

>============
>>>
>>>In implementing such a beast, it would be desireable to have only the most
>>>minimal info returned from the slave CPU's to the master CPU.
>>>
>>>The performance at best would be just a tad better than a single fast CPU
>>>set-up, but it would cost as much or more than some quad's (which really do >>run the parallel chess programs quite fast).
>>
>>I would prefer to quantify this before characterizing it as "just a tad."
>
>As is only right. Unfortunately, you have to judge the strength of the river's
>current, * BEFORE * you decide to cross or not. It isn't possible to go out and
>stand in the middle of the fastest, deepest part of the river and THEN decide
>"Should I try and cross this river or not?".
>
>The only real way to do this would be to invest in some hardware, make the
>modifications needed in the software (and that may not be trivial), set up the
>control properly, and test it.
>
>>>I'm sorry I can't give you any FACTS in this matter. In all the years I've >>been following computer chess, I've never seen a beastie configured like this,
>>>actually optimized and tested with a chess program.
>>
>>Guess what.  I can't produce any FACTS either.  We are just thinking out loud.
>>That should be OK.
>
>** No!! ** <grin>
>
>Please get back to me with the "speed limits" for PCI and USB2. I'll check into
>the throughput for the passive backplanes, and check out the single card
>computers and rack mountables.

There is a lot of information in Rosch's book about PCI.  It will take me awhile
to digest it all.

>
>It shouldn't be ALL that difficult to toy with this idea a while. What do you
>think? Wouldn't be used for Kramnik, though, I'm sure. :-(
>
>Dave
>
>P.S. Sorry to answer "inside" your post, but you have too many good things to
>chat about. It's all YOUR fault! <grin>

Me too!  :)

- - - - - - - -

Bob D.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.