Computer Chess Club Archives


Subject: Re: New intel 64 bit ?

Author: Vincent Diepeveen

Date: 17:13:06 07/03/03


On July 03, 2003 at 18:15:01, Robert Hyatt wrote:

>On July 03, 2003 at 16:50:29, Chris Hull wrote:
>
>>On July 03, 2003 at 13:03:13, Robert Hyatt wrote:
>>
>>>On July 03, 2003 at 05:51:51, Russell Reagan wrote:
>>>
>>>>On July 03, 2003 at 05:31:15, Tony Werten wrote:
>>>>
>>>>>http://www.digitimes.com/NewsShow/Article.asp?datePublish=2003/07/01&pages=02&seq=3
>>>>>
>>>>>Tony
>>>>
>>>>Interesting news. Some things the article says make me think this is nothing to
>>>>get excited about.
>>>>
>>>>"targeting the high-priced, back-end server market" - This makes me think
>>>>"nothing new here, the Itanium has been out of the price range of everyone for
>>>>years anyway." I can't imagine them competing with the Opteron (much less
>>>>Athlon64) if they can't come way down in price.
>>>>
>>>>It says something about a lower end cpu for workstations, but the way they put
>>>>it (maybe it's just the writer), it makes it sound (to me) that the high-end
>>>>Itanium will still be significantly more than the Opteron, and the low-end
>>>>Itanium will still be significantly more than the Athlon64, and that the
>>>>really-low-end Xeon might be in the price range of the Opteron.
>>>>
>>>>
>>>>"Intel servers containing eight to 128 Itanium processors..."
>>>>
>>>>So Bob, what is the expected speedup of Crafty on a 128-Itanium machine? :)
>>>
>>>
>>>Hard to say since it is a NUMA type machine.  There are lots of issues
>>>there.
>>
>>Ok, this raises the question, "Can Crafty be made to work on a NUMA-type cluster?
>>How about in a message-passing cluster using PVM or MPI?" Not just made to
>>work, but to actually see SMP-like speedups for 4/8/16/32/64-node clusters.
>>
>>Chris
>
>
>The answer to both is "yes".
>
>NUMA is a problem, but it is solvable.  The problem is that the current
>way of allocating "split blocks" is not good for NUMA machines.  A NUMA
>machine _really_ wants its often-accessed data to be in its local memory,
>and I don't have any way of forcing that at the moment.  It would not be
>terribly difficult to change it, by allocating a bunch of split blocks on
>each CPU/local-memory, and then ensuring that the right split block is used
>for the right processor.  But on an SMP box, this is moot so it was not done
>in the original design.
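
The per-node allocation described above can be sketched roughly as follows. This is only an illustration, not Crafty's code: SplitBlock, NUM_NODES, BLOCKS_PER_NODE, init_pools, get_split_block and release_split_block are all hypothetical names. Plain calloc is used so the sketch runs anywhere; on a real NUMA machine each pool would have to be placed in its node's local memory, and handing out blocks would need locking.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_NODES       4   /* hypothetical: number of NUMA nodes */
#define BLOCKS_PER_NODE 8   /* hypothetical: split blocks reserved per node */

/* Hypothetical stand-in for a split block; the real layout is not shown here. */
typedef struct {
    int  in_use;        /* 1 while a search thread owns this block */
    int  owner_node;    /* node whose pool this block belongs to */
    char payload[256];  /* placeholder for the search state */
} SplitBlock;

static SplitBlock *pool[NUM_NODES];

/* Build one pool per node. Assumption: calloc stands in for a node-local
   allocation; it does not by itself guarantee local placement. */
int init_pools(void) {
    for (int n = 0; n < NUM_NODES; n++) {
        pool[n] = calloc(BLOCKS_PER_NODE, sizeof(SplitBlock));
        if (pool[n] == NULL) return -1;
        for (int i = 0; i < BLOCKS_PER_NODE; i++) pool[n][i].owner_node = n;
    }
    return 0;
}

/* Hand out a free block from the caller's own node, or NULL if none is left.
   Not thread-safe: a real version would protect each pool with a lock. */
SplitBlock *get_split_block(int node) {
    assert(node >= 0 && node < NUM_NODES);
    for (int i = 0; i < BLOCKS_PER_NODE; i++) {
        if (!pool[node][i].in_use) {
            pool[node][i].in_use = 1;
            return &pool[node][i];
        }
    }
    return NULL;
}

/* Return a block to its pool. */
void release_split_block(SplitBlock *b) {
    b->in_use = 0;
}
```

A search thread running on node 2 would call get_split_block(2), so the block it works on comes from the pool set aside for that node.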
>
>Clustering is harder, since suddenly there is no shared memory at all,

He asked about an MPI library, which *means* not using shared memory.

Also, all those Itanium machines are sold as 'clusters'.

MPI latency on the good old Origin 3800 is even way better than on the new
SGI Altix 3000 with Madisons.

That's weird because the design looks ok to me.

Then please consider that this Altix is kicking butt compared to other Itanium
clusters on that overpraised TPC benchmark.

>which changes both the overall structure of the program as well as the
>underlying assumption that "it is easy to do a quick parallel search and
>get a result back" because network latency suddenly turns something quick
>into something with a significant latency.
>
>SMP-like speedups are likely not possible for chess, because of the way the
>alpha/beta algorithm is built around sequential searching.  But reasonable
>speedup is definitely possible.  Who cares if 1000 processors is only 100X
>faster.  100X is _way_ faster.

How do you get a 100x speedup out of 1000 CPUs with MPI?

Please tell me. It is not so trivial.

I have 500 processors with OpenMP, which is way better than MPI. But it is still
very hard to get a good speedup. I have been working on it for nearly a year
already. It is *not* trivial.
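
To put rough numbers on the "100X from 1000 processors" figure quoted above, here is a small sketch. Using Amdahl's law is an assumption made only for illustration: parallel alpha/beta loses time mainly to extra search and communication rather than to a fixed serial part, so the implied serial fraction is just a yardstick. The function names are hypothetical.

```c
#include <assert.h>

/* Parallel efficiency: speedup divided by the number of processors. */
double efficiency(double speedup, int nproc) {
    assert(nproc > 0);
    return speedup / nproc;
}

/* Amdahl's law (assumed model): speedup on n processors when a
   fraction f of the work cannot be parallelized. */
double amdahl_speedup(double f, int n) {
    assert(n > 0);
    return 1.0 / (f + (1.0 - f) / n);
}

/* Inverse of the above: the serial fraction implied by observing
   speedup s on n processors (requires n > 1). */
double implied_serial_fraction(double s, int n) {
    assert(n > 1 && s > 0.0);
    return ((double)n / s - 1.0) / (n - 1);
}
```

With these definitions, efficiency(100.0, 1000) is 0.1, and implied_serial_fraction(100.0, 1000) comes out near 0.009. Under this model, less than 1% of non-parallel work is already enough to cap 1000 processors at 100x.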

Best regards,
Vincent



Last modified: Thu, 15 Apr 21 08:11:13 -0700
