Computer Chess Club Archives



Subject: Re: Latency versus Information Bandwidth: Questions

Author: Matt Taylor

Date: 11:49:50 12/05/02



On December 05, 2002 at 10:02:21, Bob Durrett wrote:

>On December 04, 2002 at 21:13:40, Matt Taylor wrote:
>
>>On December 04, 2002 at 20:29:52, Bob Durrett wrote:
>>
>>>
>>>The recent threads shed some light on the issue of when one is more important
>>>than another, but the answer is sketchy and seems to be "depends."
>>>
>>>For current chess-playing programs, which is more important?  Latency or
>>>bandwidth?  Why?
>>>
>>>Is the answer different if multiple processors are used?
>>>
>>>Bob D.
>>
>>The answer is always "depends." It depends on how you access memory, how much
>>memory you access, and how often you access memory.
>>
>>I'm going to make the simplification here that the CPU accesses memory directly;
>>some of the work done here is actually part of the chipset, but that's just a
>>technical detail and doesn't change any of the conclusions.
>>
>>In order for an algorithm to be sensitive to bandwidth, it must be accessing
>>memory (almost) sequentially. When the CPU issues a read/write request to main
>>memory, it sends the address in two pieces: the row and the column. Sometimes
>>the row and column bits are mangled for performance, but for simplicity let's
>>assume that the row is the upper half of an address and the column is the
>>lower half.
>>
>>The CPU doesn't actually transmit both row -and- column every time it accesses
>>memory. The memory module has a row register that remembers which row you
>>accessed previously. This isn't just an optimization, either; it reduces power
>>consumption and has some other interesting effects for EE people. Anyway, when
>>the row changes, the module is forced to "close" the current row and "open" the
>>other row. The open process takes some time as the cells in the row must be
>>precharged. Avoiding a row change makes memory access faster. The column works
>>in a similar fashion. The CL value for RAM is the CAS (column address strobe)
>>latency, the latency of changing the column address.
>>
>>Now, if you're accessing memory randomly or in some fashion that requires the
>>row or column to change, you will often incur one (or both) of the CAS and RAS
>>latencies. This would make your algorithm latency-dependent.
>>
>>When multiple processors are used, the answer is a little more obscure. Now
>>that both processors are competing for the same memory, each has less
>>bandwidth. Whether that matters depends on whether the algorithm spends a
>>-lot- of time in between memory accesses. At the same time, the memory
>>accesses from the two processors usually change the row and column, which
>>means the latency is incurred on many accesses.
>>
>>Notably, though, not all SMP systems are shared-bus. The upcoming x86-64
>>Opteron chips from AMD include a bus per CPU.
>
>Incidentally, I forgot to ask:
>
>Does "hyperthreading" impact the answers to the questions?  If so, how?
>
>Bob D.

Hyperthreading is essentially a poor man's SMP. Instead of 2 CPUs running
side-by-side, Intel duplicates only the "logical" CPU -- the machine registers --
and allows 2 threads to compete for the existing execution resources on one
chip. The result is that one program can do floating-point while the other does
integer number-crunching, and you get higher throughput. The actual increase
depends on how your uses of the different resources are scheduled. (On the P4, I
think both end up being able to do integer stuff at the same time.)

I haven't thought about it long and hard, but I believe hyperthreading would
have the same effect as SMP. The contention may be somewhat alleviated for an
SMP application like Crafty because both "logical" CPUs also share the caches,
which means 2 cache hits when they execute the same code. However, you also
have the potential to thrash the cache...

-Matt




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.