Author: Robert Hyatt
Date: 09:02:40 12/16/02
After some off-the-wall comments by Vincent, I decided to resurrect a small project I had put aside a while back. As most know, I use the "Belle hash approach": two tables, one depth-preferred, where the entry with the deepest "draft" is kept, and one always-store, where we store if we can't store in the first due to insufficient draft. The second table is 2X the size of the first. The problem with the two-table approach is that the two entries are definitely going to end up in two separate cache lines, because the two table entries are not contiguous. Vincent claimed this was a huge penalty. Here are the facts:

First, I had previously modified my two-table approach to combine things into one big table. Each "bucket" now holds three entries. The first is the same depth-preferred entry as before. The other two are always-store, and a single bit of the hash signature chooses which one is used. This means the two algorithms use _exactly_ the same number of entries, and the entries are probed in _exactly_ the same way, except that now the three entries are 48 consecutive bytes in memory.

This means I now average about 1.375 cache misses per probe rather than 2.0 with the old approach. Here is the breakdown: 1/2 of the time the depth-preferred entry is the first or second sixteen bytes of a 64-byte L1 cache line, so the two always-store entries are pre-fetched with it (1 miss). The other 1/2 of the time, the depth-preferred entry is in bytes 32-47 or 48-63. In the latter case the always-store entry is a cache miss 100% of the time (2 misses); in the 32-47 case, the needed always-store entry is a hit 1/2 of the time and a miss the other 1/2 (1.5 misses on average). Averaging the four equally likely alignments gives (1 + 1 + 1.5 + 2) / 4 = 1.375.

Sounds good. I ran the test. I will give a few NPS values for comparison (in each case, the first nps is from the _old_ approach, the second nps is from the new, supposedly faster, approach):
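For concreteness, here is a minimal C sketch of the combined-bucket layout described above. The struct and function names, the 16-byte field packing, and the choice of the signature's high bit as the always-store selector are my own illustrative assumptions, not Crafty's actual code; the point is only that the three entries occupy 48 consecutive bytes, so a probe touches at most two cache lines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* A 16-byte table entry.  The field layout here is illustrative;
 * a real engine packs score/bound/best-move into these bits. */
typedef struct {
    uint64_t sig;    /* hash signature used to verify a match */
    int32_t  draft;  /* remaining search depth when stored    */
    int32_t  data;   /* stand-in for packed score/bound/move  */
} HashEntry;

/* One 48-byte bucket: three consecutive entries, as described above. */
typedef struct {
    HashEntry depth_preferred;  /* deepest draft is kept here         */
    HashEntry always_store[2];  /* a signature bit selects one of two */
} HashBucket;

static HashBucket *table;
static uint64_t table_mask;     /* bucket count is a power of two */

void hash_init(size_t buckets) {  /* buckets must be a power of two */
    table = calloc(buckets, sizeof(HashBucket));
    table_mask = buckets - 1;
}

/* Probe the depth-preferred slot first, then the single always-store
 * slot chosen by the high bit of the signature (assumed selector). */
HashEntry *hash_probe(uint64_t sig) {
    HashBucket *b = &table[sig & table_mask];
    if (b->depth_preferred.sig == sig)
        return &b->depth_preferred;
    HashEntry *as = &b->always_store[(sig >> 63) & 1];
    return (as->sig == sig) ? as : NULL;
}

/* Store: a draft at least as deep as the resident one replaces the
 * depth-preferred entry; otherwise the selected always-store entry
 * is overwritten unconditionally. */
void hash_store(uint64_t sig, int32_t draft, int32_t data) {
    HashBucket *b = &table[sig & table_mask];
    HashEntry e = { sig, draft, data };
    if (draft >= b->depth_preferred.draft)
        b->depth_preferred = e;
    else
        b->always_store[(sig >> 63) & 1] = e;
}
```

Because a bucket is 48 bytes and cache lines are 64, bucket start addresses cycle through offsets 0, 48, 32, 16 within a line, which is where the four equally likely alignment cases in the miss analysis come from.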
  old: nps=911k    new: nps=914k
  old: nps=1400k   new: nps=1425k
  old: nps=1325k   new: nps=1353k

I think the conclusion should be pretty clear: there is barely enough difference to measure. Going from 1325k to 1353k is about a 2% speed increase. The point, as I have said _many_ times, is that the time spent probing the hash table is not a significant issue. Vincent claims otherwise. The two separate tables are hardly any less efficient than one combined table. This is the same sort of conclusion I reached in the famous "your smp-lock approach sucks" argument: it is better to _test_ than to _guess_.
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.