Computer Chess Club Archives


Subject: Re: Source code to measure it - there is something wrong

Author: Ricardo Gibert

Date: 22:34:52 07/15/03


On July 16, 2003 at 01:01:55, Ricardo Gibert wrote:

>On July 16, 2003 at 00:31:28, Robert Hyatt wrote:
>
>>On July 16, 2003 at 00:02:35, Keith Evans wrote:
>>
>>>On July 15, 2003 at 22:25:01, Vincent Diepeveen wrote:
>>>
>>>>On July 15, 2003 at 20:58:18, Keith Evans wrote:
>>>>
>>>>>On July 15, 2003 at 20:30:04, Vincent Diepeveen wrote:
>>>>>
>>>>>>On July 15, 2003 at 20:08:57, Robert Hyatt wrote:
>>>>>>
>>>>>>>On July 15, 2003 at 17:58:01, Gerd Isenberg wrote:
>>>>>>>
>>>>>>>>Ok, I think there is one problem with Vincent's cache benchmark.
>>>>>>>>
>>>>>>>>There are two similar functions, DoNrng and DoNreads. DoNrng is used to measure
>>>>>>>>the time without the hash read. But its instructions have the potential for faster
>>>>>>>>execution due to fewer dependencies and stalls. The CPU may execute parts of two loop
>>>>>>>>bodies of DoNrng interleaved or simultaneously, which is not possible in DoNreads.
>>>>>>>>Therefore the time measured for N DoNrng is not the time the same instructions
>>>>>>>>take inside the N DoNreads loop; it may be much faster.
>>>>>>>
>>>>>>>That is also certainly possible.  This kind of "problem" is highly
>>>>>>>obfuscated, as you can see.  It requires a lot of analysis, by a lot of
>>>>>>>people, to see the flaws.  That's why lmbench is so respected.  It was
>>>>>>>written, a paper was written about it, another paper was written that
>>>>>>>pointed out some flaws, some of which were fixed and some of which were
>>>>>>>not really flaws.  But it has been pretty well looked at by a _lot_ of
>>>>>>>people.
>>>>>>>
>>>>>>>Other latency measures may well be as accurate, but until they "pass the
>>>>>>>test of time and exposure" they are hard to trust.
>>>>>>
>>>>>>For sure my test shows that it isn't 130 ns. It's more like 280 ns for 133MHz
>>>>>>DDR RAM. I'm not sure whether you have RDRAM or 100MHz DDR RAM in your machine, but
>>>>>>you for sure aren't at 130 ns random memory latency there.
>>>>>>
>>>>>>Whether instructions get paired better or worse is not really interesting. It would be
>>>>>>nice if it measured accurately to 0.1 ns, but an error of 0.5 ns like it has
>>>>>>now (assuming no other software is disturbing it) is not a problem for me,
>>>>>>knowing the actual latencies range from 210 ns for 150MHz RAM (with just a 300MB cache,
>>>>>>which is definitely too little) to 280 ns for 133MHz RAM (with a 500MB cache) on a P4,
>>>>>>to nearly 400 ns for dual P4/K7s with 133MHz DDR RAM.
>>>>>
>>>>>Vincent,
>>>>>
>>>>>What do you think is wrong with the lmbench lat_mem_rd (memory read latency)
>>>>>benchmark?
>>>>>
>>>>>Keith
>>>>
>>>>That's measuring the sequential latency: you first read bytes 0..7 of an
>>>>array[60000000], then bytes 8..15, then bytes 16..23, and so
>>>>on. That is faster for memory.
>>>>
>>>>However, in computer chess we do not look up positions 0 1 2 3 4 5 6 in memory;
>>>>we search. So we get semi-random lookups, which are unpredictable of course.
>>>>
>>>>So then you get confronted with extra latency for technical RAM reasons. It is
>>>>therefore interesting for computer chess to measure the average random latency.
>>>>Of course, like Gerd says, the real latency is even cooler, but it won't be far off
>>>>from the RASML test.
>>>>
>>>>Best regards,
>>>>Vincent
>>>
>>>Can't you increase the stride size in lmbench to get around this?
>>
>>
>>You don't need to.  lmbench runs at a 128-byte stride by default, and that
>>is beyond any L1/L2 line size on any Pentium-type machine made so far.
>
>
>For the PIV, the L2 cache line size is 128 bytes (divided into two 64-byte
>sectors). The L1 cache line size is 64 bytes.


In which case a 128-byte stride is just right. Never mind.
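
To make the stride arithmetic concrete, here is a small sketch in Python. It is purely illustrative: the function name and the simplified model are my own assumptions, not anything taken from lmbench. It counts how many accesses of a strided walk land on a cache line that an earlier access already touched. Once the stride reaches the line size, no access reuses a line, so every read has to fetch a new one.

```python
def reused_line_accesses(stride, line_size, n_accesses):
    """Count accesses that hit a cache line already touched earlier in the walk.

    Simplified model (an assumption for illustration): addresses are
    0, stride, 2*stride, ... and a line is identified by address // line_size.
    Hardware prefetching and the P4's 64-byte sectoring are ignored.
    """
    seen = set()
    reused = 0
    for i in range(n_accesses):
        line = (i * stride) // line_size
        if line in seen:
            reused += 1
        else:
            seen.add(line)
    return reused


# Compare a few strides against a 128-byte line, 1024 accesses each.
for stride in (8, 64, 128):
    print(stride, reused_line_accesses(stride, 128, 1024))
```

Under this model, 1024 accesses with an 8-byte stride reuse a line 960 times, a 64-byte stride reuses a line 512 times, and a 128-byte stride reuses none.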


>
>
>>
>>It also tests to measure the cache line size to avoid producing data that
>>would be badly skewed.
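
The sequential versus random distinction raised earlier in the thread can be sketched as two "next index" chains. This is only an illustration in Python, with hypothetical function names; it is not Vincent's benchmark and not lmbench code. The random chain is built with Sattolo's algorithm so that it forms a single cycle: a walk then visits every slot exactly once per lap, and each lookup depends on the result of the previous one.

```python
import random


def sequential_chain(n):
    """next[i] = (i + 1) mod n: the walk visits slots in address order."""
    return [(i + 1) % n for i in range(n)]


def random_chain(n, seed=0):
    """Sattolo's algorithm: a random permutation that is one cycle of length n."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def walk(nxt, steps):
    """Follow the chain from slot 0; each index comes from the previous lookup."""
    i = 0
    visited = []
    for _ in range(steps):
        visited.append(i)
        i = nxt[i]
    return visited


print(walk(sequential_chain(8), 8))
print(walk(random_chain(8), 8))
```

A real measurement would do the same walk in C over an array much larger than the caches, time a large number of steps, and divide by the step count. The Python version only shows the access order, not meaningful timings.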



Last modified: Thu, 07 Jul 11 08:48:38 -0700
