Author: Tony Werten
Date: 00:24:02 12/05/02
On December 04, 2002 at 23:23:32, Robert Hyatt wrote:

>On December 04, 2002 at 21:58:27, Jeremiah Penery wrote:
>
>>On December 04, 2002 at 21:13:40, Matt Taylor wrote:
>>
>>>On December 04, 2002 at 20:29:52, Bob Durrett wrote:
>>>
>>>>The recent threads shed some light on the issue of when one is more important
>>>>than another, but the answer is sketchy and seems to be "depends."
>>>>
>>>>For current chess-playing programs, which is more important? Latency or
>>>>bandwidth? Why?
>>>>
>>>>Is the answer different if multiple processors are used?
>>>>
>>>>Bob D.
>>>
>>>The answer is always "depends." It depends on how you access memory, how much
>>>memory you access, and how often you access memory.
>>>
>>>I'm going to make the simplification here that the CPU accesses memory
>>>directly; some of this work is actually done by the chipset, but that's just a
>>>technical detail and doesn't change any of the conclusions.
>>>
>>>For an algorithm to be sensitive to bandwidth, it must access memory (almost)
>>>serially. When the CPU issues a read/write request to main memory, it sends
>>>the address in two pieces: the row and the column. Sometimes the row and
>>>column bits are mangled for performance, but for simplicity let's assume that
>>>the row is the upper half of an address and the column is the lower half.
>>>
>>>The CPU doesn't actually transmit both row -and- column every time it
>>>accesses memory. The memory module has a row register that remembers which
>>>row you accessed previously. This isn't just an optimization, either; it
>>>reduces power requirements and has some other interesting effects for EE
>>>people. Anyway, when the row changes, the module is forced to "close" the
>>>current row and "open" the other row. The open process takes some time, as
>>>the cells in the row must be precharged. Avoiding a row change makes memory
>>>access faster. The column works in a similar fashion. The CL value for RAM is
>>>the CAS (column address strobe) latency, the latency of changing the column
>>>address.
>>>
>>>Now, if you're accessing memory randomly, or in some fashion that requires
>>>the row or column to change, you will often incur the CAS latency, the RAS
>>>latency, or both. This makes your algorithm latency-dependent.
>>>
>>>When multiple processors are used, the answer is a little more obscure. Now
>>>that both processors are competing for the same memory, each has less
>>>bandwidth. Does the algorithm spend a -lot- of time in between memory
>>>accesses? At the same time, the interleaved accesses from the two processors
>>>usually change the row and column, so the latency is incurred on many cycles.
>>>
>>>Notably, though, not all SMP systems are shared-bus. The upcoming x86-64
>>>Opteron chips from AMD include a bus per CPU.
>>
>>Current AthlonMP chipsets also have a separate bus per CPU. They use the same
>>EV6 bus as Alpha processors did (or still do?). The memory modules are
>>shared, whereas Hammer will have separate memory modules for each processor.
>
>The problem with that is that it turns into a NUMA architecture, which has its
>_own_ set of problems. One CPU connected to one memory module means that the
>other CPU can't get to it as efficiently...

IIRC they created a new buzzword for that: HyperTransport. I haven't seen any
tests yet of how well it really works, but it should improve the bandwidth.

Tony

>IE this doesn't offer one tiny bit of improvement over an SMP-type machine
>with shared memory... unless the algorithm is specifically designed to
>localize memory references and to duplicate data that is needed often by both
>threads...
>
>This might be an improvement for running two programs at once. For one
>program using two processors, NUMA offers additional challenges for the
>parallel programmer...
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.