Author: J. Wesley Cleveland
Date: 18:58:18 07/18/03
On July 18, 2003 at 21:17:14, Robert Hyatt wrote:

>On July 18, 2003 at 15:21:35, J. Wesley Cleveland wrote:
>
>>On July 17, 2003 at 18:25:51, Robert Hyatt wrote:
>>
>>>On July 17, 2003 at 17:35:33, Dieter Buerssner wrote:
>>>
>>[snip]
>>>>
>>>>I cannot find any randomness in the reads of lm-bench (I downloaded latest
>>>>stable source today, not the experimental version, available, too). If it would
>>>>do random reads, it would have no way to avoid the problem with the TLBs you
>>>>explained.
>>>
>>>4M pages solves it for at least 250mb worth of RAM. But then again, _no_ chess
>>>program depends on purely random memory accesses to blow out the TLB. The only
>>>truly random accesses I do are the regular hashing and pawn hashing, which
>>>both total to significantly less than the total nodes I search. Which means
>>>the TLB penalty is not even 1% of my total run time. Probably closer to
>>>.01% - .05%.
>>>
>>>I ignore that.
>>
>>Why do you think it is that low? I get ~20-30% of nodes have hash probes with
>>crafty.
>
>
>Look at the code.

I not only looked at the code, I *instrumented* it. I won't have complete results until Monday, but it appears that crafty spends 3-5% of its total time inside hashprobe on my (slow) machine, and a prefetch could reduce that by about half.

>Crafty probes memory _once_ for a hash probe. That
>introduces a memory access penalty once per node in the basic search,
>less than once per node in the q-search (I only probe phash there and I
>don't probe it but about 25% of the q-search nodes I visit).

If you had read what I wrote, you would see I said crafty does a hash probe on 20-30% of its total nodes.

>As a result, you get less than one probe per node searched. A node searched
>requires something on the order of 3000-5000 instructions. What percentage
>of that 3K-5K instruction timing is that single hash probe? Almost zero.

Except that a fast machine may do these 3-5K instructions in <1 usec. A cache miss plus a TLB miss may take 300-400 ns. I would not call 30% almost 0.

>
>Ignore hits and misses, that is not the issue here. The issue is the cost of
>doing the probe itself, which is essentially zero.
>
>
>>If you are getting 1m nodes/sec, then this is a probe every 3-5 usec.
>>With a very large hash table and 4K pages, the large majority of these will
>>cause a TLB miss. At 200 nsec each (a guess), this could be up to 5% of your
>>total run time.
>
>See above. I don't really probe once for every node.

See above. I never said you did.
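
For anyone who wants to check the arithmetic, here is a rough back-of-envelope sketch in C. The function names are made up for illustration and are not from crafty, and the inputs are just the guesses quoted in this thread (1M nodes/sec, a probe on 20-30% of nodes, 200-400 ns per miss), not measurements.

```c
/* Back-of-envelope estimate of memory-stall overhead from hash probes.
 * All names here are hypothetical; the inputs are guesses, not measurements. */

/* Fraction of wall-clock time lost to probe misses, assuming every probe
 * misses and each miss costs miss_ns nanoseconds. */
double stall_fraction(double nodes_per_sec, double probes_per_node, double miss_ns)
{
    double probes_per_sec = nodes_per_sec * probes_per_node;
    return probes_per_sec * miss_ns * 1e-9;   /* ns -> seconds */
}

/* Miss cost relative to the time spent on one node that does probe.
 * node_ns is the time to execute the node's 3-5K instructions. */
double per_node_share(double node_ns, double miss_ns)
{
    return miss_ns / node_ns;
}
```

With 1M nodes/sec, a probe on 25% of nodes and 200 ns per miss, stall_fraction() gives about 0.05, which is the 5% figure above. With a 1 usec node and a 300 ns miss, per_node_share() gives 0.3, which is where the 30% comes from. Both numbers assume every probe misses, so they are upper bounds.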
Last modified: Thu, 15 Apr 21 08:11:13 -0700