Computer Chess Club Archives


Subject: Re: hardware question (SDRAM or DDRAM?)

Author: Vincent Diepeveen

Date: 17:09:04 12/05/02


On December 05, 2002 at 15:26:00, Matt Taylor wrote:

>On December 05, 2002 at 09:02:16, Vincent Diepeveen wrote:
>
>>On December 04, 2002 at 18:52:19, Matt Taylor wrote:
>>
>>>Yes, 1.3% speed increase is significant when dealing with algorithm analysis,
>>>and 13% is even more incredible. However, 13% is barely significant when you're
>>>comparing the speeds of hardware, and that's what you're doing.
>>>
>>>The following is how RDRAM works -the way I understand it-. I could have some
>>>facts grossly wrong. My interests have been in the AMD platform because I can
>>>build a cluster that operates much faster for the same cost. That said, I am
>>>-fairly- certain that I have all my facts straight about RDRAM.
>>
>>How fast is your cluster RAM latency?
>>
>>Are we talking about a default 1Gbit network
>>with milliseconds latency or so really not capable of running
>>programs that do some inter process communications, or
>>something faster than that?
>
>Yes, I'm talking about parallel problems that don't need interprocess
>communication. Chess isn't the only thing that's computable, and it's not the
>only thing I enjoy computing. :-)

Under 'cluster' I usually imagine that I/O speed == memory speed
for the cluster. Is that the case here too?

DIEP runs easily on cc-NUMA SGI systems. Those, however, have quite
some bandwidth: about 1 terabyte/second in the case of the TERAS machine.

Of course that's for 1024 processors. Quite a bit.

>>>No RDRAM part operates on a 100 MHz clock. The pc800 part operates at 400 MHz on
>>>a DDR bus with a width of 2 bytes. This yields a maximum bandwidth of 400*2*2 =
>>>1.6 GB/sec. The pc1066 part operates at 533 MHz on the same bus, and 533*2*2 =
>>>2.1 GB/sec. The P4 FSB is an 8-byte quad-pumped bus running at 100 or 133 MHz
>>>(100*4*8 = 3.2 GB/sec, 133*4*8 = 4.2 GB/sec).
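
As a quick check of the RDRAM arithmetic above, the peak figures can be reproduced in a few lines of Python. This is only a sketch of the theoretical calculation (clock in MHz, times transfers per clock, times bus width in bytes). The function name `peak_bandwidth_gb_s` and its `channels` parameter are made up for illustration, and sustained bandwidth on real hardware will be lower.

```python
def peak_bandwidth_gb_s(clock_mhz, transfers_per_clock, bus_width_bytes, channels=1):
    """Theoretical peak bandwidth in GB/sec (1 GB = 1000 MB, as in the post)."""
    mb_per_sec = clock_mhz * transfers_per_clock * bus_width_bytes * channels
    return mb_per_sec / 1000.0

# Figures taken from the post: 2-byte bus, two transfers per clock.
pc800 = peak_bandwidth_gb_s(400, 2, 2)
pc1066 = peak_bandwidth_gb_s(533, 2, 2)

# Two pc800 modules driven in parallel, as the post describes for good chipsets.
dual_pc800 = peak_bandwidth_gb_s(400, 2, 2, channels=2)

print(f"pc800: {pc800:.3f}  pc1066: {pc1066:.3f}  2x pc800: {dual_pc800:.3f}")
```

Two pc800 channels come to 3.2 GB/sec, the same as the 100 MHz FSB figure quoted above, which is presumably why the modules are installed in pairs.
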
>>
>>This is not true for latency. You are quoting bandwidth calculation here.
>>
>>It operates at 100Mhz internally but it is quad pumped.
>>
>>That quad pumped increases bandwidth, but not latency.
>>
>>So for latency you must face the fact that it is 2 times slower.
>>
>>For latency DDR ram is 2 times faster: it is 133/100 * (15T/10T) = 2.0
>>times faster.
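
The 2.0 figure can be checked by turning the cycle counts into time. This sketch takes the post's numbers at face value (10T at 133 MHz for DDR, 15T at 100 MHz for RDRAM). They are claims from the thread, not measurements, and `access_time_ns` is a hypothetical helper name.

```python
def access_time_ns(cycles, clock_mhz):
    """Time taken by `cycles` clock ticks at `clock_mhz`, in nanoseconds."""
    return cycles / clock_mhz * 1000.0

ddr_ns = access_time_ns(10, 133)    # 10T at 133 MHz, as claimed in the post
rdram_ns = access_time_ns(15, 100)  # 15T at 100 MHz, as claimed in the post

# How many times longer the RDRAM access takes under these assumptions.
ratio = rdram_ns / ddr_ns
print(f"DDR {ddr_ns:.1f} ns, RDRAM {rdram_ns:.1f} ns, ratio {ratio:.3f}")
```

The ratio is the same product as in the post: (133/100) * (15/10).
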
>>
>>This is why DDR ram is way faster for me than RDRAM at the same
>>processor.
>>
>>Note that I do not believe the bandwidth claim either, but as I said before
>>we can discuss that forever. If you do some big matrix calculation
>>(say on 2 gigabytes) where bandwidth is important, then the test results
>>I see show that DDR ram has the bigger practical bandwidth.
>>
>>Theoretically it is clear that RDRAM has more bandwidth.
>>
>>But i do not want to get into that discussion, it is an endless discussion.
>>
>>For latency, things are very clear.
>>
>>If you still do not understand it, then test it yourself.
>
>I do agree about the latency bit, though I am not sure why you use the pc800
>part instead of the pc1066 part. I don't think latency has ever been in dispute.
>RDRAM has been criticized for that from the very beginning.
>
>I am not sure how the RDRAM bandwidth discussion is endless; I have never seen
>anyone claim that RDRAM has lower bandwidth than DDR. I'm not talking about
>lower performance in some application -- I'm talking about the synthetic tests
>that measure the amount of bandwidth of RDRAM vs. DDR. Certainly my own tests
>parallel everything I have read.
>
>A synthetic benchmark is important here because we want to measure performance
>of the hardware to use as a predictor for performance in a real-world
>application. If we wanted to measure real-world performance of the application,
>we would take his end-game database software and run it on two systems and
>compare throughput.
>
>>>Note that the P4 has a higher FSB speed than the single RDRAM chip. This is
>>>intentional. What good chipsets do is issue parallel requests. This means that 2
>>>RDRAM modules get twice the bandwidth of a single RDRAM module. Coincidentally,
>>>you are required to add them in pairs. The same technique -can- be applied to
>>>DDR, but at present I have not heard of this. (This is why I don't have to
>>>purchase my DDR modules in pairs -- much to my relief.)
>>
>>You have pretty old knowledge then. try AMD 760MPX chipset which requires
>>also 2 modules.
>
>I own an AMD 760MPX-based board, and I am currently running off of (1) Samsung
>pc2100 1GB CL=2.5 Reg/ECC module. At home I use a Tyan Tiger MPX, and at work I
>have an Iwill MPX2. Both are based on the AMD 760MPX chipset.
>
>I borrowed an Unregistered Samsung 256 MB module and paired it with an
>Unregistered Micron 256 MB module that I own, and my bandwidth is within ~10% of
>what I had from the single DIMM. My machine at work uses 2 512 MB CL=2.5 Micron
>modules, and its bandwidth is within 20-30% of what I get at home. If the
>chipset had any sort of mux, I should see more than 20-30% gain.
>
>The bandwidth calculation is done by copying large amounts of memory, a fairly
>standard algorithm. Consistent, repeatable results (within 1%) in addition to
>confirmation from other tests reassure me that the test is accurate.
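
A copy-based bandwidth test of the kind described above might look roughly like this in Python. It is a minimal sketch, not the program used in the post: the function name, buffer size and repeat count are assumptions, and interpreter overhead means the number is only indicative.

```python
import time

def copy_bandwidth_mb_s(size_bytes=32 * 1024 * 1024, repeats=5):
    """Copy a large buffer several times and return the best rate in MB/sec."""
    # Fill the source with non-zero data so every page is really backed by RAM.
    src = bytearray(b"\xa5" * size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src                      # one large memory-to-memory copy
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)         # keep the fastest run to reduce noise
    return size_bytes / best / 1e6

if __name__ == "__main__":
    print(f"copy rate: {copy_bandwidth_mb_s():.0f} MB/sec")
```

The buffer should be much larger than the L2 cache, otherwise the test measures cache speed instead of memory speed. Note that each copied byte is both read and written, so the memory traffic is about twice the reported copy rate.
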
>
>>>Chip-for-chip, DDR modules may sustain higher transfer rates, but I assure you
>>>that empirical data shows RDRAM systems winning the bandwidth war.
>>
>>No.
>>
>>This looks like a RAMBUS propaganda talk you write down here.
>>
>>For DIEP, DDR ram is 13% faster or so than RDRAM, even on the P4.
>>
>>And that while CPU speed is what matters most for DIEP.
>
>DIEP uses hash-tables. DIEP is latency-dependent. DIEP will run a little slower
>on RDRAM because it has higher latency.
>
>>>The P4 Willamette system I used at work for a while had a practical bandwidth
>>>of 2.8 GB/sec on pc800 RDRAM, a 100 MHz bus. The faster DDR-based boards use
>>>pc2700 which, as I understand, really isn't standard. This is a maximum
>>>theoretical bandwidth lower than what I have measured. I have a dual AthlonMP
>>>1600 with about 1.3 GB/sec bandwidth. The SMP factor is probably biasing the
>>>measurement, but in either case I'm not convinced that there is any DDR system
>>>that even matches that old P4 on RDRAM.
>>
>>Why not buy some chess software and compare a P4 RDRAM system with DDR ram.
>
>Chess software isn't necessarily an accurate benchmark of bandwidth, which is
>why synthetic benchmarks aren't necessarily accurate for chess. The original
>question wasn't, "Which type of ram runs chess faster?" The question was, "Which
>type of ram performs best in end-game computation?"
>
>For the computation of the database, the bandwidth is likely the most important
>factor, particularly since the WC memory type allows a lazy-commit style of
>memory writing. For table queries like DIEP uses, DDR will probably be faster.
>
>>>-Matt
>>>
>>>On December 04, 2002 at 18:19:43, Vincent Diepeveen wrote:
>>>
>>>>On December 04, 2002 at 17:40:13, Matt Taylor wrote:
>>>>
>>>>i hope you realize that good programmers/designers work months
>>>>to get 1.3% speedup. Both chessprogrammers for their program
>>>>and hardware designers for their chips.
>>>>
>>>>13% is really a lot then if you understand that the speed
>>>>of DIEP isn't depending only upon memory speed, but even more
>>>>upon processor speed.
>>>>
>>>>So the actual speedup of DDR ram over SDRAM in latency is
>>>>more like 100% faster, which is actually true.
>>>>
>>>>DDR ram needs 10T versus RDRAM 15T. That's already 50%.
>>>>
>>>>RDRAM initially was clocked 100Mhz and
>>>>the DDR ram is clocked 133Mhz.
>>>>
>>>>Nowadays there is also RDRAM clocked to higher speeds than 100Mhz
>>>>(quad pumped of course), but still it is of course 50% slower in
>>>>timing than DDR ram.
>>>>
>>>>So RDRAM might nowadays perhaps win on bandwidth. (Tests
>>>>which pump actual terabytes of data through the ram suggest
>>>>that the fastest DDR ram can pump through more than the fastest RDRAM,
>>>>despite the theoretical specifications of both. But I don't want
>>>>to get in the middle of a battle that is being fought non-stop; the
>>>>truth is simply that you have to choose to believe either the technical
>>>>specifications or the speeds actually measured by experts, so it is
>>>>a forever 'yes' 'no' fight.) But there is not a single doubt about
>>>>which has the better latency.
>>>>
>>>>DDR ram has over 50% faster latency than RDRAM. This is very clear.
>>>>The bus of most of the tested old P4s was 100Mhz, versus K7 soon
>>>>already 133Mhz. So also that speed difference we must take into
>>>>account.
>>>>
>>>>If that total of 1.33 * 1.5 = 2.0 times faster latency is
>>>>then giving a 13% speedup of DIEP, then that is quite a lot IMHO.
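
One way to read these two numbers together is through Amdahl's law: if only a fraction f of the run time scales with memory latency, then overall speedup = 1 / ((1 - f) + f / latency_speedup). The sketch below solves that for f. It assumes the whole 13% gain comes from latency and nothing else, which the post does not establish, and `latency_bound_fraction` is a hypothetical name.

```python
def latency_bound_fraction(overall_speedup, latency_speedup):
    """Fraction f of run time that scales with memory latency, solved from
    overall_speedup = 1 / ((1 - f) + f / latency_speedup)."""
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / latency_speedup)

# Figures from the post: 13% overall speedup from 2.0 times faster latency.
f = latency_bound_fraction(1.13, 2.0)
print(f"latency-bound fraction: {f:.3f}")
```

Under that assumption roughly 23% of the run time would be spent waiting on memory, with the rest bound by the processor.
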
>>>>
>>>>>On December 04, 2002 at 13:32:01, Vincent Diepeveen wrote:
>>>>>
>>>>>>On December 04, 2002 at 11:42:17, Matt Taylor wrote:
>>>>>>
>>>>>>>On December 04, 2002 at 10:43:59, Vincent Diepeveen wrote:
>>>>>>>
>>>>>>>>On December 04, 2002 at 10:21:08, James T. Walker wrote:
>>>>>>>>
>>>>>>>>>On December 04, 2002 at 08:00:35, martin fierz wrote:
>>>>>>>>>
>>>>>>>>>>hi,
>>>>>>>>>>
>>>>>>>>>>i'm on the lookout for a new PC for endgame database computations. i'll probably
>>>>>>>>>>be buying a lot of ram, 2-3GB. i see that there is a big price difference
>>>>>>>>>>between DDRAM and SDRAM. IIRC the main difference is that you get a larger
>>>>>>>>>>bandwidth, but about the same latency with DDR - so i suppose i'm better off
>>>>>>>>>>buying SDRAM for my application. any opinions of the experts?
>>>>>>>>>>
>>>>>>>>>>thanks in advance
>>>>>>>>>>  martin
>>>>>>>>>
>>>>>>>>>For what it's worth:  I purchased one stick (256M) of DDR ram to compare to my
>>>>>>>>>cheap SDRAM.  I found no noticeable difference in chess performance (just price).
>>>>>>>>> I did not do any extensive testing.  I simply compared Fritz marks.  I suspect
>>>>>>>>>that in the future most motherboards will not accept the SDRAM.
>>>>>>>>>Jim
>>>>>>>>
>>>>>>>>I see a big difference. 64 versus 32 bytes cache lines matters
>>>>>>>>a lot for DIEP and all software that doesn't fit within L1 cache.
>>>>>>>>
>>>>>>>>Best regards,
>>>>>>>>Vincent
>>>>>>>
>>>>>>>Cache line size is a part of the CPU, not the ram. There are a number of
>>>>>>>transitional products, both P4 and Athlon, that accept both SDRAM and DDR SDRAM.
>>>>>>>(However, I have never heard of anyone happy with these products.)
>>>>>>
>>>>>>the P4 ended up being a lot faster for DIEP when i tested a p4 with ddr ram
>>>>>>instead of RDRAM.
>>>>>>
>>>>>>P4 with ddr ram (northwood) is like 1.5 : 1 against a K7;
>>>>>>it used to be 1.7 : 1 against a k7 with rdram.
>>>>>>
>>>>>>So 1.7 Ghz P4 rdram == 1.0Ghz K7 for DIEP
>>>>>>   2.4 Ghz P4 ddr   == 1.6Ghz K7 for DIEP (both ddr).
>>>>>>
>>>>>>DDR is a big step forward!!
>>>>>>
>>>>>>i don't know where the processor gets 64 bytes instead of 32 bytes in
>>>>>>the design. I just know it gets 64 bytes, versus SDRAM 32.
>>>>>>
>>>>>>Best regards,
>>>>>>Vincent
>>>>>
>>>>>By your figures, DDR SDRAM speed compared to RDRAM speed on a P4 platform is
>>>>>1.7/1.5 = 113%. I wouldn't call 13% a "big step forward."
>>>>>
>>>>>This also makes the assumption that both the 1 GHz K7 and 1.6 GHz K7 run equally
>>>>>fast. The 1 GHz K7 is the Thunderbird chip. The 1.6 GHz K7 is the AthlonXP 1900.
>>>>>Thunderbirds report that they are model 4, whereas AthlonXP 1900 may report
>>>>>model 6 (palomino) or 8 (thoroughbred). Model 4 and Model 6 are not the same
>>>>>thing, and they differ in MORE than just instructions. One change that I have
>>>>>observed is that the model 6 L2 cache is slightly faster. Other timings have
>>>>>probably changed, too.
>>>>>
>>>>>I will also mention that a 2.4 GHz P4 is the P4 Northwood. The 1.7 GHz P4 may be
>>>>>a Northwood, but I suspect (based on the numbers) that it was probably the older
>>>>>Willamette. The major difference is that the P4 Willamette had a smaller L2
>>>>>cache (256KB instead of 512KB).
>>>>>
>>>>>I will have to agree with Jeremiah, here. If DDR SDRAM is faster, DIEP is
>>>>>latency-dependent. If RDRAM is faster, it would be bandwidth-dependent. I have
>>>>>measured pc800 RDRAM bandwidth on one of my systems, and it exceeds theoretical
>>>>>bandwidth on any standard part DDR SDRAM. (I am not completely sure, but I don't
>>>>>think pc2700 is part of the JEDEC specification.)
>>>>>
>>>>>I am not sure what you're saying about 64-bytes vs. 32-bytes, but I assure you
>>>>>that SDRAM-based, DDR-based, and RDRAM-based P4s all have the same cache line size.
>>>>>The information is available from the cpuid instruction. The vector is
>>>>>documented in both Intel and AMD literature, but off-hand I don't know which
>>>>>vector it is. There are many utilities, especially for Windows, that will give
>>>>>this information. I -believe- wcpuid is one such utility, but I usually end up
>>>>>writing a program every time I get curious about cpuid information.
>>>>>
>>>>>If you would like, I will write such a program and post it.
>>>>>
>>>>>-Matt
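
For reference, the line size can be read out of cpuid register values as sketched below. Python cannot execute cpuid itself, so this only decodes values obtained some other way (for example from a small C helper or a cpuid utility). The two sample register values are hypothetical, chosen to show the bit layout, and the function names are made up for illustration.

```python
def clflush_line_size(ebx_leaf1):
    """CPUID leaf 1: EBX bits 15:8 give the CLFLUSH line size in 8-byte units.
    Only meaningful on CPUs that report CLFLUSH support."""
    return ((ebx_leaf1 >> 8) & 0xFF) * 8

def amd_l1d_line_size(ecx_leaf_80000005):
    """CPUID leaf 0x80000005 (AMD): ECX bits 7:0 give the L1 data cache
    line size in bytes."""
    return ecx_leaf_80000005 & 0xFF

sample_ebx = 0x00010800  # hypothetical leaf 1 EBX: line-size field = 8
sample_ecx = 0x40020140  # hypothetical AMD ECX: 64 KB, 2-way, 64-byte lines

print(clflush_line_size(sample_ebx), amd_l1d_line_size(sample_ecx))
```

Either way the value comes from the processor, so swapping SDRAM for DDR or RDRAM on the same CPU does not change it.
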



Last modified: Thu, 15 Apr 21 08:11:13 -0700
