Computer Chess Club Archives

Subject: Re: Latency versus Information Bandwidth: Questions

Author: Eugene Nalimov

Date: 09:40:14 12/06/02

On December 06, 2002 at 07:32:57, Vincent Diepeveen wrote:

>On December 05, 2002 at 01:14:18, Jeremiah Penery wrote:
>
>>On December 04, 2002 at 23:23:32, Robert Hyatt wrote:
>>
>>>>Current AthlonMP chipsets also have a separate bus per CPU.  They use the same
>>>>EV6 bus as Alpha processors did (or still do?).  The memory modules are shared,
>>>>whereas Hammer will have separate memory modules for each processor.
>>>
>>>
>>>The problem with that is it turns into a NUMA architecture which has its _own_
>>>set of problems.  One cpu connected to one memory module means that the other
>>>CPU can't get to it as efficiently...
>>>
>>>I.e., this doesn't offer one tiny bit of improvement over an SMP-type machine with
>>>shared memory...  Unless the algorithm is specifically designed to attempt to
>>>localize memory references and duplicate data that is needed by both threads
>>>often...
>>>
>>>This might be an improvement for running two programs at once.  For one
>>>program using two processors, NUMA offers additional challenges for the
>>>parallel programmer...
>>
>>According to all documentation, which I have no reason to doubt, a non-local
>>memory access in a Hammer system is just as fast as a memory access in a
>>processor/chipset combination where the memory controller resides in the
>>northbridge (i.e. all other x86 configurations).  Local memory accesses are
>>quite a lot faster.  Therefore, the average case, even in 8-way machines that
>>take up to 3 hops for a memory access, is still below that of any x86 machine of
>>today.
>
>If you read the documentation as written, you are confronted with
>theoretical data that doesn't take into account any worst-case part
>of the configuration.
>
>Bob is nearer the truth here than you might want to guess, because
>as soon as you run on those supercomputers with a certain theoretical
>peak performance and test them yourself, the practical peak
>is up to 50 times slower than the theoretical data suggests.
>
>So on paper this is way faster and even works up to 8 CPUs (which
>we are unlikely ever to see working). As good propagandists, those papers
>are not going to tell you the weak spots in the design which prevent
>that *theoretic* performance from happening in reality.
>
>In case they get this dual CPU to work we will see what its speed is.
>
>For now I assume it's a cluster, like Bob does.
>
>Note that it's nearly impossible to get an 8-CPU machine with
>that architecture to work. Imagine how complex its design will be.
>
>Which OS will work on that?

Windows .Net Server.

Thanks,
Eugene

>Best regards,
>Vincent
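
The average-case argument quoted above (local accesses are faster, remote accesses take up to 3 hops) can be put as a weighted average. The sketch below is only an illustration: the function name, the latency figures, and the assumption that each hop adds a constant cost are all hypothetical and are not measurements of any Hammer or AthlonMP system. It shows why the outcome depends on how well a program localizes its memory references, which is the point Robert Hyatt raises.

```python
def average_latency(local_ns, hop_ns, hop_fractions):
    """Expected memory latency under a simple (assumed) linear hop model.

    hop_fractions[h] is the fraction of accesses that need h hops
    (h = 0 means the access is local). The fractions must sum to 1.
    """
    if abs(sum(hop_fractions) - 1.0) > 1e-9:
        raise ValueError("hop fractions must sum to 1")
    # Each access costs the local latency plus a fixed cost per hop.
    return sum(f * (local_ns + h * hop_ns)
               for h, f in enumerate(hop_fractions))


# Hypothetical figures, chosen only to make the arithmetic visible.
LOCAL_NS = 80.0   # assumed local access latency
HOP_NS = 40.0     # assumed extra latency per hop

# A program that spreads accesses evenly over 0..3 hops.
uniform = average_latency(LOCAL_NS, HOP_NS, [0.25, 0.25, 0.25, 0.25])

# A program that keeps most of its references local.
localized = average_latency(LOCAL_NS, HOP_NS, [0.85, 0.05, 0.05, 0.05])

print(f"uniform access pattern:   {uniform:.1f} ns")
print(f"localized access pattern: {localized:.1f} ns")
```

Under these made-up numbers the localized pattern comes out well ahead of the uniform one. Whether either beats a shared northbridge memory controller in practice is exactly what the posters disagree on, and the sketch does not settle it.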


Last modified: Thu, 15 Apr 21 08:11:13 -0700
