Author: Robert Hyatt
Date: 11:13:02 03/19/03
On March 19, 2003 at 12:32:09, Matt Taylor wrote:

>On March 18, 2003 at 23:22:33, Robert Hyatt wrote:
>
>>On March 18, 2003 at 20:07:13, Aaron Gordon wrote:
>>
>>>Here are some tests that I've run on one of my machines. It's an AthlonXP 1900+
>>>@ 1.6GHz (non-overclocked), 133 FSB, on an Abit KT7a (KT133A) motherboard. The RAM
>>>is regular SDRAM (non-DDR), three DIMMs: one 256 MB and two 128 MB. So three
>>>slots in use, six banks in use total (0-5).
>>>
>>>4-way interleave
>>>Sisoft memory test:
>>>ALU w/ SSE        : 1002 MB/s
>>>ALU w/o SSE or MMX:  577 MB/s
>>>
>>>CraftyK7 19.3: 1,046,116 nodes/sec
>>>
>>>
>>>2-way interleave
>>>Sisoft memory test:
>>>ALU w/ SSE        : 1002 MB/s
>>>ALU w/o SSE or MMX:  547 MB/s
>>>
>>>CraftyK7 19.3: 1,046,116 nodes/sec
>>>
>>>
>>>No interleaving
>>>Sisoft memory test:
>>>ALU w/ SSE        :  993 MB/s
>>>ALU w/o SSE or MMX:  517 MB/s
>>>
>>>CraftyK7 19.3: 1,046,116 nodes/sec
>>>
>>>
>>>As you can see, interleaving did nothing for Crafty. Doubling RAM bandwidth
>>>while keeping all hardware, memory timings, etc. identical also produced no
>>>increase (unless you want to count 0.14% as something outside the margin of
>>>error of the benchmark).
>>
>>
>>That is interesting, but what it means I have no idea. I haven't run the
>>test in a long time, but it is easy to get the new Pentium processors to
>>count and report the number of cache-line misses while a program runs. Last
>>time I tried it the number was huge, as expected, since nearly every hash
>>probe is almost certain to be a miss, and the large tables used for move
>>generation, bit counting, and so forth also drive this number up...
><snip>
>
>And that is perfectly explicable here. Interleaving doesn't help misses. Unless
>you get concurrent memory accesses to different modules, interleaving doesn't
>help at all. The CPU is OOOE, so some things will execute despite the stalled
>memory access, but eventually that access is needed to continue execution...
>
>-Matt

Certainly correct. I wasn't thinking at all when I wrote the above. Interleaving
_should_ drive overall latency down for back-to-back reads, since the individual
latencies add up and interleaved banks can overlap them. But with a single CPU,
perhaps it doesn't work that way: the second memory read can't be started until
the first has completed, so reducing the delivery time (after the initial
latency) doesn't speed up the next read, since it hasn't even been scheduled yet.
On a dual, it is a big deal, of course.
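For concreteness, here is a minimal sketch of why nearly every hash probe is a
miss. This is not Crafty's actual code; the entry layout and table size are
assumptions for illustration. The point is only that the index is derived from a
Zobrist hash, so successive probes land on effectively random cache lines in a
table far larger than any cache.

    /* Hypothetical transposition-table probe, for illustration only. */
    #include <stdlib.h>

    typedef struct {
        unsigned long long key;    /* full Zobrist signature for verification */
        short score;
        unsigned char depth;
        unsigned char flags;
    } HashEntry;

    #define TABLE_ENTRIES (1 << 22)   /* 4M entries -- tens of MB, dwarfs the caches */

    static HashEntry *table;

    HashEntry *hash_probe(unsigned long long zobrist_key)
    {
        /* Low bits of the hash pick the slot: an essentially random
           cache line on every probe, so the access almost always misses. */
        HashEntry *e = &table[zobrist_key & (TABLE_ENTRIES - 1)];
        return (e->key == zobrist_key) ? e : NULL;
    }

    int main(void)
    {
        table = calloc(TABLE_ENTRIES, sizeof *table);
        /* ... a real engine would fill and probe the table during search ... */
        return table ? 0 : 1;
    }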
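And here is a rough microbenchmark sketch of the latency-versus-bandwidth
distinction in the last paragraph (array size and timing method are arbitrary
choices, not anything from the thread). The streaming loop issues independent
reads, so the hardware can overlap them and extra bank bandwidth actually helps;
the pointer chase cannot start one read until the previous one has returned, so
it pays the full latency every time -- the same situation as a search stalled on
a hash probe.

    /* Sketch: bandwidth-bound vs. latency-bound memory access. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (8u * 1024 * 1024)   /* 8M entries, far larger than any cache */

    int main(void)
    {
        size_t *chase = malloc(N * sizeof *chase);
        size_t *data  = malloc(N * sizeof *data);
        size_t i, next = 0, sum = 0;
        clock_t t;

        if (!chase || !data)
            return 1;

        for (i = 0; i < N; i++) {
            chase[i] = i;
            data[i]  = i;
        }
        /* Sattolo's algorithm: one random cycle, so every load in the chase
           depends on the one before it.  (Assumes RAND_MAX is large, as on glibc.) */
        for (i = N - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t tmp = chase[i]; chase[i] = chase[j]; chase[j] = tmp;
        }

        /* Dependent reads: one outstanding access at a time, pure latency. */
        t = clock();
        for (i = 0; i < N; i++)
            next = chase[next];
        printf("pointer chase: %.2fs (sink %lu)\n",
               (double)(clock() - t) / CLOCKS_PER_SEC, (unsigned long)next);

        /* Independent sequential reads: streaming, bandwidth-bound. */
        t = clock();
        for (i = 0; i < N; i++)
            sum += data[i];
        printf("streaming sum: %.2fs (sink %lu)\n",
               (double)(clock() - t) / CLOCKS_PER_SEC, (unsigned long)sum);

        free(chase);
        free(data);
        return 0;
    }

On a modern Linux system the same effect shows up directly in the hardware
counters mentioned above (e.g. perf stat -e cache-misses), but the wall-clock gap
between the two loops already tells the story.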