Author: Robert Hyatt
Date: 20:45:16 07/18/03
Go up one level in this thread
On July 18, 2003 at 21:58:18, J. Wesley Cleveland wrote: >On July 18, 2003 at 21:17:14, Robert Hyatt wrote: > >>On July 18, 2003 at 15:21:35, J. Wesley Cleveland wrote: >> >>>On July 17, 2003 at 18:25:51, Robert Hyatt wrote: >>> >>>>On July 17, 2003 at 17:35:33, Dieter Buerssner wrote: >>>> >>>[snip] >>>>> >>>>>I cannot find any randomness in the reads of lm-bench (I downloaded latest >>>>>stable source today, not the experimental version, available, too). If it would >>>>>do random reads, it would have no way to avoid the problem with the TLBs you >>>>>explained. >>>> >>>>4M pages solves it for at least 250mb worth of RAM. But then again, _no_ chess >>>>program depends on purely random memory accesses to blow out the TLB. The only >>>>truly random accesses I do are the regular hashing and pawn hashing, which >>>>both total to significantly less than the total nodes I search. Which means >>>>the TLB penalty is not even 1% of my total run time. Probably closer to >>>>.01% - .05%. >>>> >>>>I ignore that. >>> >>>Why do you think it is that low? I get ~20-30% of nodes have hash probes with >>>crafty. >> >> >>Look at the code. >I not only looked at the code. I *instrumented it*. I won't have complete >results until Monday, but it appears that crafty spends 3-5% of its total time >inside hashprobe on my (slow) machine and a prefetch could reduce that by about >half. > >>Crafty probes memory _once_ for a hash probe. That >>introduces a memory access penalty once per node in the basic search, >>less than once per node in the q-search (I only probe phash there and I >>don't probe it but about 25% of the q-search nodes I visit). > >If you had read whai I wrote, you would see I said crafty does a hash probe >20-30% of its total nodes. OK. I clearly mis-read what you meant. the 20-30% was eye-catching as that is a pretty common hash hit percentage as well... > >>As a result, you get less than one probe per node searched. A node searched >>requires something on the order of 3000-5000 instructions. What percentage >>of that 3K-5K instruction timing is that single hash probe? Almost zero. > >Except that a fast machine may do these 3-5K instructions in <1usec. A cache >miss + a TLB miss may take 300-400 ns. I would not call 30% almost 0. You are missing my point. In the position(s) you tested, you saw 20-30% hash probes. That means one probe for every 3-5 nodes. At 1M nodes per second, that is 200K-300K probes per second. If you measure the time spent in searching a single node, multiply that by 3-5X, then compare that to the hash probe time, the time spent probing the hash table is low. Note that your 5% is _not_ the total time used to probe the table. It is the time to probe the table, and do it _twice_ although the second probe doesn't have any memory access penalty associated with it in most cases. So a big percent of that 5% is doing the actual work done in HashProbe(), rather than being all memory access penalty... >> >>Ignore hits and misses, that is not the issue here. The issue is the cost of >>doing the probe itself, which is essentially zero. >> >> >> >> >>>If you are getting 1m nodes/sec, then this is a probe every 3-5 usec. >>>With a very large hash table and 4K pages, the large majority of these will >>>cause a TLB miss. At 200 nsec each (a guess), this could be up to 5% of your >>>total run time. >> >>See above. I don't really probe once for every node. > >See above. I never said you did.
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.