Computer Chess Club Archives


Subject: Re: Another memory latency test

Author: J. Wesley Cleveland

Date: 12:35:17 07/21/03


On July 18, 2003 at 23:45:16, Robert Hyatt wrote:

>On July 18, 2003 at 21:58:18, J. Wesley Cleveland wrote:
>
>>On July 18, 2003 at 21:17:14, Robert Hyatt wrote:
>>
>>>On July 18, 2003 at 15:21:35, J. Wesley Cleveland wrote:
>>>
>>>>On July 17, 2003 at 18:25:51, Robert Hyatt wrote:
>>>>
>>>>>On July 17, 2003 at 17:35:33, Dieter Buerssner wrote:
>>>>>
>>>>[snip]
>>>>>>
>>>>>>I cannot find any randomness in the reads of lm-bench (I downloaded the latest
>>>>>>stable source today, not the experimental version, which is also available). If
>>>>>>it did random reads, it would have no way to avoid the problem with the TLBs you
>>>>>>explained.
>>>>>
>>>>>4M pages solve it for at least 250MB worth of RAM.  But then again, _no_ chess
>>>>>program depends on purely random memory accesses to blow out the TLB.  The only
>>>>>truly random accesses I do are the regular hashing and pawn hashing, which
>>>>>both total to significantly less than the total nodes I search.  Which means
>>>>>the TLB penalty is not even 1% of my total run time.  Probably closer to
>>>>>.01% - .05%.
>>>>>
>>>>>I ignore that.
>>>>
>>>>Why do you think it is that low? I see hash probes on ~20-30% of nodes with
>>>>crafty.
>>>
>>>
>>>Look at the code.
>>I not only looked at the code. I *instrumented it*. I won't have complete
>>results until Monday, but it appears that crafty spends 3-5% of its total time
>>inside hashprobe on my (slow) machine and a prefetch could reduce that by about
>>half.
>>
>>>Crafty probes memory _once_ for a hash probe.  That
>>>introduces a memory access penalty once per node in the basic search,
>>>less than once per node in the q-search (I only probe phash there, and only
>>>at about 25% of the q-search nodes I visit).
>>
>>If you had read what I wrote, you would see I said crafty does a hash probe
>>on 20-30% of its total nodes.
>
>OK.  I clearly misread what you meant.  The 20-30% was eye-catching as that
>is a pretty common hash hit percentage as well...
>
>
>>
>>>As a result, you get less than one probe per node searched.  A node searched
>>>requires something on the order of 3000-5000 instructions.  What percentage
>>>of that 3K-5K instruction timing is that single hash probe?  Almost zero.
>>
>>Except that a fast machine may do these 3-5K instructions in <1usec. A cache
>>miss + a TLB miss may take 300-400 ns. I would not call 30% almost 0.
>
>You are missing my point.  In the position(s) you tested, you saw 20-30%
>hash probes.  That means one probe for every 3-5 nodes.  At 1M nodes
>per second, that is 200K-300K probes per second.  If you measure the
>time spent in searching a single node, multiply that by 3-5X, then compare
>that to the hash probe time, the time spent probing the hash table is low.
>
>Note that your 5% is _not_ the total time used to probe the table.  It is
>the time to probe the table, and do it _twice_ although the second probe
>doesn't have any memory access penalty associated with it in most cases.
>
>So a big percent of that 5% is doing the actual work done in HashProbe(),
>rather than being all memory access penalty...

I ran some tests on my slow (450 MHz) machine, with hash set to 192 MB. The test
was 21 middle-game positions and ran for nearly 1 hour. Crafty got between 125k
and 230k nps and spent 3.6% of its total time in HashProbe. I then added the
following code just before the call to RepetitionCheck() in search.c (slightly
modified from the code in hash.c). Note that the code is basically a no-op, as
all variables are local; it only reads the hash entry early.

{
  static BITBOARD word1;
  BITBOARD temp_hashkey;
  HASH_ENTRY *htable;
/*
 ----------------------------------------------------------
|                                                          |
|   first, compute the initial hash address and choose     |
|   which hash table (based on color) to probe.            |
|                                                          |
 ----------------------------------------------------------
*/

  temp_hashkey=(wtm) ? HashKey : ~HashKey;
  htable=trans_ref_a+((int) temp_hashkey&hash_maska);
  word1=htable->word1;
}

Now crafty spends 2.8% of its time in HashProbe, down from 3.6%. That is 0.8
percentage points, or roughly a 22% relative reduction.
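
For readers without the crafty source at hand, the early-read idea can be
sketched as a small standalone C fragment. This is only an illustration under
stated assumptions, not crafty's actual code: the two-word HASH_ENTRY layout,
the names table_init, slot_for, touch_entry, store_entry and probe_entry, and
the single table are all hypothetical. The point it shows is that the touch and
the later probe compute the same slot address, so the early read pulls in the
cache line (and TLB entry) that the real probe will need.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t BITBOARD;                              /* stand-in for a 64-bit key type */
typedef struct { BITBOARD word1, word2; } HASH_ENTRY;   /* hypothetical entry layout */

static HASH_ENTRY *table;        /* hypothetical single hash table */
static uint64_t mask;            /* entries - 1; entries must be a power of two */
static volatile BITBOARD sink;   /* volatile so the early load is not optimized away */

/* Allocate a zeroed table; returns 0 on success, -1 on allocation failure. */
int table_init(uint64_t entries) {
  assert(entries != 0 && (entries & (entries - 1)) == 0);
  free(table);
  table = calloc(entries, sizeof *table);
  if (!table) return -1;
  mask = entries - 1;
  return 0;
}

/* Map a key to its slot; the key is complemented for the side not to move. */
HASH_ENTRY *slot_for(BITBOARD key, int wtm) {
  BITBOARD k = wtm ? key : ~key;
  return table + (k & mask);
}

/* Early touch: read one word of the entry and discard it. */
void touch_entry(BITBOARD key, int wtm) {
  sink = slot_for(key, wtm)->word1;
}

/* Store data under key, using word2 as the signature. */
void store_entry(BITBOARD key, int wtm, BITBOARD data) {
  HASH_ENTRY *e = slot_for(key, wtm);
  e->word1 = data;
  e->word2 = key;
}

/* Later probe: returns 1 and writes *data on a signature match, else 0.
   In this toy version a key of 0 would match an empty (zeroed) slot. */
int probe_entry(BITBOARD key, int wtm, BITBOARD *data) {
  HASH_ENTRY *e = slot_for(key, wtm);
  if (e->word2 != key) return 0;
  *data = e->word1;
  return 1;
}
```

In a search loop, touch_entry(key, wtm) would be called as soon as the key is
known, with other per-node work done before probe_entry(key, wtm, &data). With
GCC or Clang, __builtin_prefetch(slot_for(key, wtm)) is a compiler-specific
alternative to the volatile read; whether either one helps depends on the
machine and on how much work sits between the touch and the probe.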



Last modified: Thu, 15 Apr 21 08:11:13 -0700