Computer Chess Club Archives



Subject: Re: Source code to measure it - there is something wrong

Author: Robert Hyatt

Date: 20:50:01 07/15/03



On July 15, 2003 at 20:26:28, Vincent Diepeveen wrote:

>On July 15, 2003 at 17:58:01, Gerd Isenberg wrote:
>
>>OK, I think there is one problem with Vincent's cache benchmark.
>>
>>There are two similar functions, DoNrng and DoNreads. DoNrng is used to measure
>>the time without the hash read. But its instructions have the potential for faster
>>execution due to fewer dependencies and stalls. It may execute parts of two loop
>>bodies of DoNrng interlaced or simultaneously, which is not possible in DoNreads.
>>Therefore the time measured for N iterations of DoNrng is not the time those
>>instructions take inside the N-iteration DoNreads loop; it may be much shorter.
>
>If you look at the generated assembly code you will see that nothing is wrong.
>If something goes wrong in the pairing of the instructions, that might make a
>difference of 1 to 2 clock cycles at most.

You have to look _well_ beyond "the generated assembly code".  Did you read
his comment?  (Of course not.)

Re-read it, paying careful attention to the "it may execute parts of two
loop bodies of DoNrng interlaced or simultaneously", and try to figure out what
that means.

It's important.

>
>At 2GHz that's 0.5 nanoseconds.

It could easily be 50ns too...
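
To put the two figures in the same units, here is the arithmetic as a small C
helper. The name cycles_to_ns is made up for this sketch and is not part of the
benchmark. At 2GHz one cycle lasts 0.5 ns, so 1 to 2 cycles is 0.5 to 1 ns,
while 50ns corresponds to 100 cycles.

```c
#include <assert.h>

/* Hypothetical helper: convert a cycle count to nanoseconds.
   A clock of f GHz completes f cycles per nanosecond. */
double cycles_to_ns(double cycles, double clock_ghz) {
  assert(clock_ghz > 0.0);
  return cycles / clock_ghz;
}
```

For example, cycles_to_ns(2.0, 2.0) gives 1 ns, and cycles_to_ns(100.0, 2.0)
gives 50 ns.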

You are just focused on the _wrong_ idea.  "Pairing" is irrelevant to the
out-of-order execution (OOE) of the Pentium 4.


>
>It is cool if it measures accurately to 0.5 ns of course, but that was never the
>intention of the test. It was intended to measure latencies on big
>supercomputers when n processors read from the memory of just 1 poor node.

And there it might be accurate enough for you.  But not on a PC.


>
>The other test, latencyC.c, does a criss-cross reference. A 0.5 ns error in
>the measurement there is not really relevant. (That assumes no other system
>activity disturbs it, which it does, because other processes may also use RAM
>on your node. That can give major differences, but it still doesn't mean there
>is a bug in the test.)
>
>If you care about a 0.5 ns difference then shoot me, but don't ever say it's 130ns
>like Bob claims for 133MHz DDR RAM.

I don't have 133MHz DDR RAM, and I don't claim 130ns.  I claim 150ns for 100MHz
quad-pumped DDR RAM.  And actually _I_ don't claim that.  An industry-accepted
benchmark (lm-bench) reports 150ns, using my Dell with ECC registered DDR
RAM.
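
For context, latency benchmarks of this kind typically time a chain of
dependent loads: each load's address comes from the previous load, so the
processor cannot overlap them. Below is a minimal sketch of that technique in
C. It is not lm-bench's code, and build_cycle, chase, and ns_per_load are
names invented for the illustration. The generator is a plain 64-bit LCG used
only to shuffle the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Fill next[] with one random cycle over all n slots (Sattolo's algorithm),
   so following next[] from any slot visits every slot before returning. */
void build_cycle(size_t *next, size_t n, uint64_t seed) {
  size_t i;
  assert(n >= 1);
  for (i = 0; i < n; i++) next[i] = i;
  for (i = n - 1; i > 0; i--) {
    size_t j, tmp;
    seed = seed * 6364136223846793005ULL + 1442695040888963407ULL;
    j = (size_t)((seed >> 33) % i);   /* 0 <= j < i */
    tmp = next[i]; next[i] = next[j]; next[j] = tmp;
  }
}

/* Follow the chain: every load depends on the result of the previous one. */
size_t chase(const size_t *next, size_t start, size_t steps) {
  size_t p = start;
  while (steps--) p = next[p];
  return p;
}

static volatile size_t sink;   /* keeps the compiler from dropping the loop */

/* Average time per dependent load in nanoseconds, measured with clock(). */
double ns_per_load(const size_t *next, size_t steps) {
  clock_t t0, t1;
  assert(steps > 0);
  t0 = clock();
  sink = chase(next, 0, steps);
  t1 = clock();
  return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)steps;
}
```

To approximate main-memory latency, the table would need to be much larger
than the last-level cache, with steps in the millions. The resolution of
clock() is coarse, so short runs will not give a meaningful number.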


>
>You're on brand-new, I bet CL2 150MHz DDR RAM, if not faster.
>
>The 280 ns was measured for a P4 with a 533 bus or 133MHz DDR RAM, and the 400 ns,
>as you see, is from the duals. For me 380 ns goes to 400 ns or so when using
>bigger hashtables. I see no significant difference between 380 and 400, so to
>speak. It's simply *huge* random latency.
>
>>
>>int DoNrng(BITBOARD n) {
>>  BITBOARD i=1,dummyres,nents;
>>  int t1,t2;
>>
>>  nents = nentries; /* hopefully this gets into a register */
>>  dummyres = globaldummy;
>>
>>  t1 = GetClock();
>>  do {
>>    BITBOARD index = RanrotA()%nents;
>>    dummyres ^= index;
>>  } while( i++ < n );
>>  t2 = GetClock();
>>
>>  globaldummy = dummyres;
>>  return(t2-t1);
>>}
>>
>>int DoNreads(BITBOARD n) {
>>  BITBOARD i=1,dummyres,nents;
>>  int t1,t2;
>>
>>  nents = nentries; /* hopefully this gets into a register */
>>  dummyres = globaldummy;
>>
>>  t1 = GetClock();
>>  do {
>>    BITBOARD index = RanrotA()%nents;
>>    dummyres ^= hashtable[index];
>>  } while( i++ < n );
>>  t2 = GetClock();
>>
>>  globaldummy = dummyres;
>>
>>  return(t2-t1);
>>}
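
One way to make the subtraction between the two loops more meaningful is to
give both the same loop-carried dependency, so that neither can overlap
iterations more than the other. The sketch below is an assumption about how
that could look, not a change anyone in this thread has posted. RanrotA is not
shown above, so a xorshift64 generator (NextRandom) stands in for it. BITBOARD
is assumed to be a 64-bit unsigned integer, and RngOnlyChained and
ReadsChained are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t BITBOARD;   /* assumed: 64-bit unsigned */

/* Stand-in generator (xorshift64); the original RanrotA is not shown. */
static BITBOARD rng_state = 88172645463325252ULL;
static BITBOARD NextRandom(void) {
  rng_state ^= rng_state << 13;
  rng_state ^= rng_state >> 7;
  rng_state ^= rng_state << 17;
  return rng_state;
}

/* Baseline: the index depends on acc, and acc depends on the index,
   so iteration i+1 cannot start its modulo before iteration i finishes. */
BITBOARD RngOnlyChained(BITBOARD n, BITBOARD nents, BITBOARD seed) {
  BITBOARD i, acc = seed;
  assert(nents > 0);
  for (i = 0; i < n; i++) {
    BITBOARD index = (NextRandom() ^ acc) % nents;
    acc ^= index;
  }
  return acc;
}

/* Same chain, but the value fed back into acc comes from the table read,
   so each load must complete before the next address can be computed. */
BITBOARD ReadsChained(const BITBOARD *table, BITBOARD n, BITBOARD nents,
                      BITBOARD seed) {
  BITBOARD i, acc = seed;
  assert(nents > 0);
  for (i = 0; i < n; i++) {
    BITBOARD index = (NextRandom() ^ acc) % nents;
    acc ^= table[index];
  }
  return acc;
}
```

Timing both functions with the same n and nents and subtracting leaves, to a
first approximation, only the cost of the serialized table reads. The table
contents must not be all zero, or acc never changes and the chain is trivial.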




Last modified: Thu, 07 Jul 11 08:48:38 -0700
