Computer Chess Club Archives


Subject: Re: Since the CPU is what really count for Chess !

Author: Robert Hyatt

Date: 07:04:56 03/18/03


On March 18, 2003 at 07:38:39, Matt Taylor wrote:

>On March 18, 2003 at 00:01:44, Robert Hyatt wrote:
>
>>On March 17, 2003 at 22:59:30, Aaron Gordon wrote:
>>
>>>On March 17, 2003 at 18:47:27, Eugene Nalimov wrote:
>>>
>>>>I just ran the experiment. I used 2 otherwise identical 64-bit systems, one with
>>>>3Mb of L3 cache, the other with 1.5Mb. The machine with the bigger cache ran Crafty's
>>>>"bench" command 12% faster (1 CPU).
>>>>
>>>>That means that
>>>>(1) Crafty's working set doesn't fit into 1.5Mb,
>>>>(2) For systems with a cache of 1.5Mb or less (i.e. for almost all x86 systems),
>>>>memory speed matters for Crafty.
>>>>
>>>>Thanks,
>>>>Eugene
>>>
>>>Those types of systems aren't what people normally use. Most people here have a
>>>Pentium 3, Athlon, Pentium 4, etc. Here is something I found with Crafty.
>>>
>>>Using the Nforce2 chipset I'm able to run the ram at speeds from 50% up to 200%
>>>(100% being synchronous) of the fsb speed. I tested 200MHz FSB (400DDR) with
>>>200MHz memory (400DDR) and 200fsb with 100MHz memory (200DDR).
>>>The difference between ~1.6GB/s memory and ~3.2GB/s memory with Crafty's 'bench'
>>>command was 0.14%. Yes, about one seventh of one percent.
>>
>>That might well suggest _another_ bottleneck in that particular machine....
>
>What would that be?
>
>I ran a similar test on my AthlonXP 2500 w/nForce 2 chipset. Running the memory
>bus at 100 MHz or 133 MHz didn't make a significant difference in nps. The
>processor scored around 1.12 MN/s, and it scored some 20-30 KN/s more with a 133
>MHz memory bus. The FSB was 166 MHz in both cases.
>
>-Matt

Were I guessing, I would guess the following:

1.  No interleaving, which means that the raw memory latency is stuck at
120+ns and stays there.  A faster bus means nothing without interleaving
if latency is the problem.

2.  Crafty depends mainly on latency, although it does a lot of reads
as well.  But if latency is the bottleneck, then a faster bus is not going
to help, except for whatever boost it gets from tricks used to load a cache
line faster by streaming in data.
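
One way to put a rough number on the "bus is not the bottleneck" guess is a
back-of-the-envelope estimate from the 0.14% figure quoted above.  The sketch
below assumes a simple two-part model that is not from the thread: run time is
a fixed part plus a part that scales with 1/bandwidth.  The function name and
the model are illustrative assumptions only.

```python
def bandwidth_bound_fraction(speedup, bandwidth_ratio):
    """Estimate the fraction of baseline run time that scales with 1/bandwidth.

    Assumed model (hypothetical, for illustration):
        T_before = fixed + scaled
        T_after  = fixed + scaled / bandwidth_ratio
    Solving T_before / T_after = speedup for scaled / T_before gives the
    expression returned below.
    """
    if speedup < 1.0 or bandwidth_ratio <= 1.0:
        raise ValueError("expected speedup >= 1 and bandwidth_ratio > 1")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / bandwidth_ratio)


# Numbers quoted in the thread: doubling memory bandwidth
# (~1.6GB/s -> ~3.2GB/s) made 'bench' 0.14% faster.
frac = bandwidth_bound_fraction(speedup=1.0014, bandwidth_ratio=2.0)
print(f"bandwidth-bound share of run time: {frac:.2%}")
```

Under that model, well under 1% of the run time would be sensitive to bus
bandwidth on that machine, which fits with something else (such as
latency) dominating the time spent on memory.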


When a chipset really interleaves, the first reference cycle is going to
take whatever memory latency demands, but successive cycles will be faster:
each further 8-byte chunk comes in one bus cycle after the previous one, so
every 32 bytes is fetched faster with interleaving than without.
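
The arithmetic behind this can be sketched as follows.  The 120ns latency,
8-byte chunks and 32-byte line come from the text above, and the 100MHz and
200MHz clocks from the test quoted earlier.  The cost model itself (full
latency for the first chunk, one bus cycle for each later chunk) is a
simplifying assumption that ignores DDR double-pumping and chipset details;
the function name is hypothetical.

```python
def line_fill_ns(latency_ns, bus_mhz, line_bytes=32, chunk_bytes=8):
    """Approximate time to fetch one cache line when chunks stream in.

    Assumed model: the first chunk costs the full memory latency, and each
    remaining chunk arrives one bus cycle after the previous one.
    """
    chunks = line_bytes // chunk_bytes      # 32 / 8 = 4 chunks per line
    cycle_ns = 1000.0 / bus_mhz             # one bus cycle in nanoseconds
    return latency_ns + (chunks - 1) * cycle_ns


slow = line_fill_ns(latency_ns=120, bus_mhz=100)   # 120 + 3 * 10 ns
fast = line_fill_ns(latency_ns=120, bus_mhz=200)   # 120 + 3 * 5 ns
saving = 1.0 - fast / slow

print(f"100MHz bus: {slow:.0f}ns per line")
print(f"200MHz bus: {fast:.0f}ns per line")
print(f"saving from doubling the bus clock: {saving:.0%}")
```

With these assumed numbers, doubling the bus clock trims a line fill from
150ns to 135ns, about 10%, and only on accesses that miss the cache.  The
fixed latency term dominates, so a faster bus alone buys little.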



