Computer Chess Club Archives


Subject: Re: Anandtech on the FSB/memory bandwidth balance

Author: Matt Taylor

Date: 17:59:49 01/03/03


On January 03, 2003 at 13:22:42, Anthony Martini wrote:

>On January 03, 2003 at 07:02:44, Peter Kasinski wrote:
>
>>On January 03, 2003 at 03:20:55, Matt Taylor wrote:
>>
>>>On January 02, 2003 at 22:48:24, Anthony Martini wrote:
>>>
>>>>On January 02, 2003 at 16:48:15, Dan Andersson wrote:
>>>>
>>>>>The thing is that RDRAM is in hardware purgatory. Rambus pretty much managed to
>>>>>piss each and every company involved in computer motherboards off. The current
>>>>>situation is that Intel is going with DDR400 memory. The situation may change in
>>>>>the future. But currently they are fading fast. And RDRAM isn't all sunshine and
>>>>>roses either.
>>>>>
>>>>>MvH Dan Andersson
>>>>
>>>>   Dan is right, Rambus has pissed a lot of people off... RDRAM 1066 is the
>>>>fastest memory out there (DDR2 is on the horizon), but it is more expensive and
>>>>requires a more expensive motherboard... I was online in OCT this year looking
>>>>for computers, and at the DELL Factory Outlet they were showing systems w/RDRAM
>>>>1066... In all tests that I know of, RDRAM is faster, but most OEM's are
>>>>currently using DDR...
>>>>
>>>>   Try this site--> http://www.tomshardware.com/
>>>>
>>>>  -,
>>>>     Anthony
>>>
>>>There is no universally faster solution. It always depends on what you are
>>>doing. RDRAM is really nice when you have a lot of serial computation to do.
>>>DDR-II will obsolete RDRAM as they will then have the same bandwidth. DDR-III is
>>>also planned.
>>>
>>>I would not be surprised if Tom's Hardware claimed that DDR SDRAM is slower than
>>>regular SDRAM. I do remember some of the comparisons they made between RDRAM and
>>>DDR SDRAM in the beginning, and I was thoroughly unimpressed. Poor technique,
>>>lack of facts, etc. Better to look for a site where the author knows what he is
>>>talking about.
>>>
>>>-Matt
>>
>>I tried this: http://www.anandtech.com/showdoc.html?i=1615&p=3
>>
>>Granted, this is from May, but it discusses what the "quad-pumped" 533 bus speed
>>means from the memory perspective.  Their conclusion seems to be that unless
>>PC1066 RDRAM is used the true throughput of the new systems wouldn't be realized.
>>Of course, the DDR-III would alter that assessment.
>>
>>My real question remains:  Would a Dell dual Xeon at 2.8 GHz (533) not be _by
>>definition_ impeded by its use of the DDR SDRAM today?
>>
>>Thanks again,
>>PK
>
>   Yes, I would definitely think it would be impeded by DDR today. Memory
>bandwidth is the biggest bottleneck in throughput today - even more than processor
>speed. If I had a Dell dual Xeon 2.8 GHz, I would want 1066 RDRAM instead of
>DDR. However, I wouldn't spend that kind of money on a system. One could get a
>new computer every 2 years, instead of 4 for that price (which, I personally do)-
>esp. w/the new 64-bit processors on the horizon...
>
>   -,
>       Anthony

It still depends on your application. Let's consider two basic routines: memset
and a routine that sums entries looked up in a hash table.

void memset(char *ptr, char fill, size_t len)
{
    for(size_t i = 0; i < len; i++)
        *ptr++ = fill;
}

int sum_something(const int *hash, const int *indices)
{
    int sum = 0;

    /* indices is terminated by -1 */
    while(*indices != -1)
        sum += hash[*indices++];

    return sum;
}

The function parameters and local variables will not impede performance much
here because they reside in the cache. Presuming that the buffers behind ptr in
memset and hash in sum_something do not reside in the cache (they are pretty
big, say 16 MB each), these two functions have different requirements.

In memset, the Pentium 4 will prefetch the next 128 bytes ahead of the loop.
Fetching a block from main memory takes a large number of clocks (>100), but the
memset routine operates on the data already in the cache while the next block is
being fetched, so a good amount of the memory latency is hidden behind useful
work. When that work is over, it doesn't have to wait very long to get the next
block because the Pentium 4 already fetched it.

In the latter function, the Pentium 4 is incapable of figuring out which part of
the hash (which is quite big) is going to be required next. For EVERY iteration,
the Pentium 4 is going to have to go out to main memory and fetch a part of
hash. The code can't do anything in between. The biggest bottleneck here will be
the number of cycles it takes to fetch that part of hash.
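
To put rough numbers on the two cases, here is a back-of-the-envelope sketch in
C. The function names and every figure in print_example (2 GB/s, 100 ns, one
million lookups) are placeholders picked for round arithmetic, not measurements
of any real memory system. The only point is which term dominates: bytes divided
by bandwidth when the fetches overlap with work, misses times latency when they
don't.

```c
#include <stdio.h>

/* Bandwidth-bound case: prefetching hides latency, so the time is roughly
   the amount of data divided by the sustained transfer rate. */
double streaming_seconds(double bytes, double bytes_per_second)
{
    return bytes / bytes_per_second;
}

/* Latency-bound case: every miss is assumed to wait for the previous one,
   so the time is roughly the number of misses times the miss latency. */
double serialized_miss_seconds(double misses, double latency_seconds)
{
    return misses * latency_seconds;
}

/* Prints both estimates for placeholder (hypothetical) figures. */
void print_example(void)
{
    double bytes     = 16.0 * 1024.0 * 1024.0; /* 16 MB buffer, as in the post */
    double bandwidth = 2.0e9;                  /* hypothetical: 2 GB/s */
    double misses    = 1.0e6;                  /* hypothetical: 1M lookups */
    double latency   = 100.0e-9;               /* hypothetical: 100 ns per miss */

    printf("streaming 16 MB:      %.4f s\n",
           streaming_seconds(bytes, bandwidth));
    printf("1M serialized misses: %.4f s\n",
           serialized_miss_seconds(misses, latency));
}
```

Plug in the bandwidth and latency of whatever memory you are comparing. The
first estimate only improves with more bandwidth, the second only with lower
latency.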

In the former case, RDRAM wins big because it can deliver more data per second
(higher bandwidth). In the latter case, DDR wins big because it can deliver the
requested data 33-50% sooner (lower latency). This is a classic problem. The
answer is, when you can overlap memory fetches with useful work, the latency is
mostly hidden. When you can't, you get screwed by it.
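
If you want to see the difference on your own box, something like the following
rough C sketch would do. Everything in it is my own naming (sum_sequential,
sum_indexed, fill_scattered_indices, report_timings), and the index generator is
just a simple linear congruential generator chosen so the lookups land all over
the table. It times a straight walk over a table against the same number of
scattered lookups through a -1 terminated index list. How big the gap is depends
heavily on the CPU, the compiler flags, and the memory, so treat the output as a
hint and not a benchmark.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Sequential walk: the addresses are predictable, so a hardware
   prefetcher can stay ahead of the loop. */
long sum_sequential(const int *table, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += table[i];
    return sum;
}

/* Scattered walk: each address comes from indices[], which ends with -1. */
long sum_indexed(const int *table, const int *indices)
{
    long sum = 0;
    while (*indices != -1)
        sum += table[*indices++];
    return sum;
}

/* Fills indices[0..n-1] with pseudo-random values in [0, n) and writes the
   -1 sentinel at indices[n]. The caller must provide n + 1 elements. */
void fill_scattered_indices(int *indices, size_t n, unsigned seed)
{
    unsigned long long state = seed;
    for (size_t i = 0; i < n; i++) {
        state = state * 6364136223846793005ULL + 1442695040888963407ULL;
        indices[i] = (int)((state >> 33) % n);
    }
    indices[n] = -1;
}

/* Times both walks over a table of n ints and prints the results.
   Returns 0 on success, -1 if allocation fails. */
int report_timings(size_t n)
{
    int *table = malloc(n * sizeof *table);
    int *indices = malloc((n + 1) * sizeof *indices);
    if (table == NULL || indices == NULL) {
        free(table);
        free(indices);
        return -1;
    }

    for (size_t i = 0; i < n; i++)
        table[i] = 1;
    fill_scattered_indices(indices, n, 12345u);

    clock_t t0 = clock();
    long seq = sum_sequential(table, n);
    clock_t t1 = clock();
    long idx = sum_indexed(table, indices);
    clock_t t2 = clock();

    printf("sequential: sum=%ld  %.3f s\n", seq,
           (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("scattered:  sum=%ld  %.3f s\n", idx,
           (double)(t2 - t1) / CLOCKS_PER_SEC);

    free(table);
    free(indices);
    return 0;
}
```

Call report_timings((size_t)1 << 22) from main to get a 16 MB table of 4-byte
ints, which matches the size assumed above. Make the table much larger than your
cache, or both loops will just run out of cache and look the same.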

In the case of dual-CPU systems, the quad-pumping they do will change the
outcome a bit. I would expect that Intel also uses quad-pumped DDR chipsets. I
have not read much on the subject because I don't own a Pentium 4. Someone else
might be able to shed some light here.

-Matt




Last modified: Thu, 15 Apr 21 08:11:13 -0700
