Computer Chess Club Archives


Subject: Re: Source code to measure it - there is something wrong

Author: Robert Hyatt

Date: 21:31:28 07/15/03



On July 16, 2003 at 00:02:35, Keith Evans wrote:

>On July 15, 2003 at 22:25:01, Vincent Diepeveen wrote:
>
>>On July 15, 2003 at 20:58:18, Keith Evans wrote:
>>
>>>On July 15, 2003 at 20:30:04, Vincent Diepeveen wrote:
>>>
>>>>On July 15, 2003 at 20:08:57, Robert Hyatt wrote:
>>>>
>>>>>On July 15, 2003 at 17:58:01, Gerd Isenberg wrote:
>>>>>
>>>>>>Ok, i think there is one problem with Vincent's cache benchmark.
>>>>>>
>>>>>>There are two similar functions, DoNrng and DoNreads. DoNrng is used to measure
>>>>>>the time without the hash read. But its instructions have the potential for faster
>>>>>>execution due to fewer dependencies and stalls. The CPU may execute parts of two loop
>>>>>>bodies of DoNrng interleaved or simultaneously - that is not possible in DoNreads.
>>>>>>Therefore the time for N DoNrng is not the time used inside the N DoNrng loop,
>>>>>>and it may be much shorter.
>>>>>
>>>>>That is also certainly possible.  This kind of "problem" is highly
>>>>>obfuscated, as you can see.  It requires a lot of analysis, by a lot of
>>>>>people, to see the flaws.  That's why lm-bench is so respected.  It was
>>>>>written, a paper was written about it, another paper was written that
>>>>>pointed out some flaws, some of which were fixed and some of which were
>>>>>not really flaws.  But it has been pretty well looked at by a _lot_ of
>>>>>people.
>>>>>
>>>>>Other latency measures may well be as accurate, but until they "pass the
>>>>>test of time and exposure" they are hard to trust.
>>>>
>>>>For sure my test shows that it isn't 130 ns. It's more like 280 ns for 133MHz
>>>>DDR RAM. Not sure whether you have RDRAM in your machine or 100MHz DDR RAM, but
>>>>you for sure aren't at 130 ns random memory latency there.
>>>>
>>>>Whether instructions get paired better or worse is not really interesting. It would
>>>>be nice if it measured to 0.1 ns accuracy, but an error of 0.5 ns like it has
>>>>now (assuming no other software is disturbing it) is not a problem for me,
>>>>knowing the actual latencies range from 210 ns for 150MHz RAM (with just a 300MB
>>>>cache, which is definitely too little) to 280 ns for 133MHz RAM (with a 500MB cache)
>>>>on a P4, to nearly 400 ns for dual P4/K7s with 133MHz DDR RAM.
>>>
>>>Vincent,
>>>
>>>What do you think is wrong with the lmbench lat_mem_rd (memory read latency)
>>>benchmark?
>>>
>>>Keith
>>
>>That's measuring the sequential latency. So in an array[60000000] you first
>>read the first 8 bytes, then bytes 8..15, then bytes 16..23 and so
>>on. That is faster for memory.
>>
>>However, in computer chess we do not look up positions 0 1 2 3 4 5 6 in memory;
>>we search. So we get semi-random lookups, which are unpredictable of course.
>>
>>So then you get confronted with extra latency for technical RAM reasons. It is
>>therefore interesting for computer chess to measure the average random latency.
>>Of course like Gerd says the real latency is even cooler but it won't be far off
>>from the RASML test.
>>
>>Best regards,
>>Vincent
>
>Can't you increase the stride size in lmbench to get around this?


You don't need to.  lm-bench runs at a 128-byte stride by default, and that
is beyond any L1/L2 line size on any Pentium-type machine made so far.

It also measures the cache line size, to avoid producing data that
would be badly skewed.
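
To make the stride idea concrete, here is a minimal sketch in C of a strided pointer-chasing loop of the general kind a memory read latency test uses. It is not the lmbench source. The names build_chain, chase and ns_per_load are made up for illustration, and timing uses the standard clock() call, which is coarse. Each slot sits `stride` bytes after the previous one, so with a 128-byte stride no two consecutive loads share a cache line of 128 bytes or less, and every load depends on the result of the one before it.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Link the buffer into a single cycle: the slot at offset i*stride holds
 * the address of the slot at offset (i+1)*stride, and the last slot points
 * back to the first. Returns the number of slots in the cycle.
 * Assumes buf is suitably aligned for a pointer (malloc memory is). */
static size_t build_chain(char *buf, size_t bytes, size_t stride)
{
    assert(stride >= sizeof(char *) && stride % sizeof(char *) == 0);
    size_t n = bytes / stride;
    for (size_t i = 0; i < n; i++) {
        char **slot = (char **)(buf + i * stride);
        *slot = buf + ((i + 1) % n) * stride;
    }
    return n;
}

/* Follow the chain for `hops` loads. Each load needs the address produced
 * by the previous one, so the loads cannot overlap with each other. */
static char *chase(char *start, size_t hops)
{
    char **p = (char **)start;
    while (hops--)
        p = (char **)*p;
    return (char *)p;
}

/* Average time per dependent load in nanoseconds, using clock() (CPU time). */
static double ns_per_load(char *buf, size_t hops)
{
    clock_t t0 = clock();
    char *volatile sink = chase(buf, hops); /* keep the result live */
    clock_t t1 = clock();
    (void)sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / (double)hops;
}
```

To try it, malloc a buffer much larger than the L2 cache, call build_chain(buf, bytes, 128), then call ns_per_load(buf, hops) with a hop count in the millions. This chain walks memory in address order; measuring the random-access latency discussed above would need the slots linked in a shuffled order instead.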




Last modified: Thu, 07 Jul 11 08:48:38 -0700
