Author: Matt Taylor
Date: 11:49:50 12/05/02
On December 05, 2002 at 10:02:21, Bob Durrett wrote:

>On December 04, 2002 at 21:13:40, Matt Taylor wrote:
>
>>On December 04, 2002 at 20:29:52, Bob Durrett wrote:
>>
>>>The recent threads shed some light on the issue of when one is more important
>>>than the other, but the answer is sketchy and seems to be "depends."
>>>
>>>For current chess-playing programs, which is more important? Latency or
>>>bandwidth? Why?
>>>
>>>Is the answer different if multiple processors are used?
>>>
>>>Bob D.
>>
>>The answer is always "depends." It depends on how you access memory, how much
>>memory you access, and how often you access memory.
>>
>>I'm going to make the simplification here that the CPU accesses memory
>>directly; some of that work is actually done by the chipset, but that's just
>>a technical detail and doesn't change any of the conclusions.
>>
>>For an algorithm to be sensitive to bandwidth, it must access memory (almost)
>>serially. When the CPU issues a read/write request to main memory, it sends
>>the address in two pieces: the row and the column. Sometimes the row and
>>column bits are mangled for performance, but for simplicity let's assume that
>>the row is the upper half of the address and the column is the lower half.
>>
>>The CPU doesn't actually transmit both row -and- column every time it
>>accesses memory. The memory module has a row register that remembers which
>>row was accessed previously. This isn't just an optimization, either; it
>>reduces power requirements and has some other interesting effects for EE
>>people. Anyway, when the row changes, the module is forced to "close" the
>>current row and "open" the new one. The open process takes some time because
>>the cells in the row must be precharged. Avoiding a row change makes memory
>>access faster. The column works in a similar fashion. The CL value for RAM is
>>the CAS (column address strobe) latency: the latency of changing the column
>>address.
>>Now, if you're accessing memory randomly, or in some fashion that requires
>>the row or column to change, you will often incur the CAS latency, the RAS
>>latency, or both. That makes your algorithm latency-dependent.
>>
>>When multiple processors are used, the answer is a little more obscure. With
>>both processors competing for the same memory, each has less bandwidth. Does
>>the algorithm spend a -lot- of time between memory accesses? At the same
>>time, the interleaved accesses from the two processors usually change the
>>row and column, which means the latency is incurred on many cycles.
>>
>>Notably, though, not all SMP systems are shared-bus. The upcoming x86-64
>>Opteron chips from AMD include a bus per CPU.
>
>Incidentally, I forgot to ask:
>
>Does "hyperthreading" impact the answers to the questions? If so, how?
>
>Bob D.

Hyperthreading is essentially a poor man's SMP. Instead of two CPUs running
side-by-side, Intel duplicates the "logical" CPU (the machine registers) and
lets two threads compete for the existing execution resources on one chip. The
result is that one program can do floating-point while the other does integer
number-crunching, and you get higher throughput. The actual increase depends
on how the two threads' uses of the different resources are scheduled. (On the
P4, I think both end up being able to do integer work at the same time.)

I haven't thought about it long and hard, but I believe hyperthreading would
have the same effect as SMP. It may be somewhat alleviated for an SMP
application like Crafty because the two "logical" CPUs also share the caches,
which means two cache hits when they execute the same code. However, you also
have the potential to thrash the cache...

-Matt
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.