Computer Chess Club Archives


Subject: Re: Latency versus Information Bandwidth: Questions

Author: Matt Taylor

Date: 13:44:43 12/06/02


On December 06, 2002 at 07:32:57, Vincent Diepeveen wrote:

>On December 05, 2002 at 01:14:18, Jeremiah Penery wrote:
>
>>On December 04, 2002 at 23:23:32, Robert Hyatt wrote:
>>
>>>>Current AthlonMP chipsets also have a separate bus per CPU.  They use the same
>>>>EV6 bus as Alpha processors did (or still do?).  The memory modules are shared,
>>>>whereas Hammer will have separate memory modules for each processor.
>>>
>>>
>>>The problem with that is it turns into a NUMA architecture which has its _own_
>>>set of problems.  One cpu connected to one memory module means that the other
>>>CPU can't get to it as efficiently...
>>>
>>>IE this doesn't offer one tiny bit of improvement over a SMP-type machine with
>>>shared memory...  Unless the algorithm is specifically designed to attempt to
>>>localize memory references and duplicate data that is needed by both threads
>>>often...
>>>
>>>This might be an improvement for running two programs at once.  For one
>>>program using two processors, NUMA offers additional challenges for the
>>>parallel programmer...
>>
>>According to all documentation, which I have no reason to doubt, a non-local
>>memory access in a Hammer system is just as fast as a memory access in a
>>processor/chipset combination where the memory controller resides in the
>>northbridge (i.e. all other x86 configurations).  Local memory accesses are
>>quite a lot faster.  Therefore, the average case, even in 8-way machines that
>>take up to 3 hops for a memory access, is still below that of any x86 machine of
>>today.
>
>If you read the documentation as written, you are confronted with
>theoretical data that doesn't take into account any worst-case part
>of the configuration.
>
>Bob is closer to the truth here than you might guess, because
>as soon as you run on those supercomputers with a certain theoretical
>peak performance and test it yourself, the practical peak
>is up to 50 times slower than the theoretical data suggests.
>
>So on paper this is way faster and even works up to 8 CPUs (which we are
>unlikely ever to see working). Like good propagandists, those papers
>are not going to tell you the weak spots in the design which prevent
>that *theoretical* performance from happening in reality.
>
>In case they get this dual CPU to work we will see what its speed is.
>
>For now I assume it's a cluster, like Bob does.
>
>Note that it's nearly impossible to get an 8 CPU machine to work with
>that architecture. Imagine how complex its design will be.
>
>Which OS will work on that?
>
>Best regards,
>Vincent

First of all, this is a crossbar. Other crossbar systems have scaled up to 64
nodes, or so I've heard. Crossbar performance is -much- better than that of your
typical NUMA system. Economical crossbar systems take the same approach AMD is
taking: each node adds a crossbar to the system. I don't think that's a
coincidence.
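
To put the "up to 3 hops" figure quoted above in perspective, here is a rough
sketch that counts hops between nodes with a breadth-first search. The topology
is purely an assumption for illustration: 8 nodes wired as a cube with 3 links
per node. It is not AMD's documented layout, and it says nothing about actual
latencies, only about how many hops an access might cross.

```python
from collections import deque

def hop_counts(links, start):
    """Breadth-first search: number of hops from `start` to every node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in links[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

# Hypothetical topology (an assumption, not AMD's layout):
# 8 nodes wired as a cube, so each node has exactly 3 links.
links = {n: [n ^ 1, n ^ 2, n ^ 4] for n in range(8)}

dist = hop_counts(links, 0)
worst_case = max(dist.values())           # farthest node from node 0
average = sum(dist.values()) / len(dist)  # includes the local node (0 hops)

print(worst_case, average)
```

Under that assumed wiring, and if memory references were spread evenly over all
nodes, the average access would cross fewer hops than the worst case. A real
program's access pattern would shift that average either way.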

Second of all, any OS that supports SMP on a shared bus will support Opteron.
All of the cache coherency and switching is done in hardware. An OS can optimize
further by recognizing that this is a NUMA architecture. However, I think the MP
1.4 spec, which has been available for a couple of years, allows the
specification of NUMA configurations. Linux64 and Windows XP 64-bit are old, old
announcements.

Third of all, AMD demonstrated a quad-Opteron system at Computex Taipei 2002 (in
addition to a fair number of other shows). The biggest hurdle in 8-way Opterons
is finding PCB real estate. I don't think AMD would be promising such systems if
they were uncertain whether or not they could deliver, even if other companies
(Rambus) do make such promises.

Right now performance of Opteron systems is admittedly bad, but the performance
of most prototypes is. As for the chips themselves, an 800 MHz Clawhammer
prototype was reportedly faster than the 1.6 GHz Willamette in 32-bit code.

-Matt



Last modified: Thu, 15 Apr 21 08:11:13 -0700