Author: Robert Hyatt
Date: 12:43:41 01/19/99
On January 19, 1999 at 13:06:33, Bruce Moreland wrote:
>
>On January 19, 1999 at 08:13:52, Robert Hyatt wrote:
>
>>On January 18, 1999 at 19:01:01, Dann Corbit wrote:
>
>>(2) use some form of hash 'buckets'. IE in cray blitz,
>>I took the low order N bits of the hash signature as the probe address. I took
>>the next 10 bits as a 'rehash offset'. (call this RO). I then checked the
>>positions at offset, offset+RO, offset+2RO, ... offset+7RO. Either approach is
>>ok. The latter tends to solve the 'chaining' problem because two different
>>hash keys identical in the last N bits (probe address) still may be different in
>>the next 10 bits, giving a different 'bucket'.
>
>I have never understood this logic, and would like to discuss it.
>
>The situation is that you get a collision at one point in the table, and the
>question is where to rehash to.
>
>I would like to ask why not just set RO to some constant, for instance one.
>You've got a collision in a table that will tend to be full, I don't understand
>why the element at some random location in the table is more likely to be empty
>or unimportant than the one at the next index in the table.
>
> for (;;) {
> if (FOverwrote(index))
> return;
> index = (index + 1) % tabsize;
> }
>
>As opposed to:
>
> ro = (key >> N) & 2047;
> for (;;) {
> if (FOverwrote(index))
> return;
> index = (index + ro) % tabsize;
> }
>
>I don't see what you get for your extra instructions.
>
>bruce
This does two things... the most important of which is to eliminate 'chaining'
in the probing/storing. I.e., if you hash to entry N in the table and you use
a constant offset R, you get the series of entries N, N+R, N+2R, ..., N+7R. If
another key hashes to N+2R (different low order n bits), it now uses the bucket
N+2R, N+3R, ..., N+9R. Notice that those two buckets have 6 entries in common.
In Cray Blitz, or in the approach commonly called 'rehashing', the first bucket
still starts at N, but now R is a variable extracted directly from the hash key. So
the first key probes at N, N+R1, N+2R1, ..., N+7R1, and the second key (sticking
with the example above, where it starts at N+2R1) now probes
N+2R1, N+2R1+R2, N+2R1+2R2, ..., N+2R1+7R2, which has only the first
entry in common with the first bucket, unless R1=R2, which is unlikely since
0 <= R1 <= 1023 in our hashing scheme.
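To make the indexing concrete, here is a minimal C sketch of that
variable-offset probe. HashEntry, table, TABLE_BITS, and probe() are
hypothetical names for illustration, not the actual Cray Blitz code, and
storing the full signature in each entry is just one possible way to verify
a hit:

  #include <stdint.h>
  #include <stddef.h>

  #define TABLE_BITS 16                       /* low N bits -> probe address */
  #define TABLE_SIZE (1u << TABLE_BITS)

  typedef struct {
      uint64_t key;    /* full signature stored for verification */
      int      score;  /* illustrative payload */
  } HashEntry;

  static HashEntry table[TABLE_SIZE];

  /* Probe the 8-entry 'bucket' selected by the signature.  The bucket start
     comes from the low TABLE_BITS bits; the stride comes from the next 10
     bits, so two keys that collide on the start address usually still get
     different strides and share at most one slot. */
  static HashEntry *probe(uint64_t sig)
  {
      uint32_t start = (uint32_t)(sig & (TABLE_SIZE - 1));
      uint32_t ro    = (uint32_t)((sig >> TABLE_BITS) & 1023);
      if (ro == 0) ro = 1;   /* avoid probing one slot 8 times (my assumption) */
      for (int i = 0; i < 8; i++) {
          HashEntry *e = &table[(start + (uint32_t)i * ro) & (TABLE_SIZE - 1)];
          if (e->key == sig)
              return e;                        /* hit */
      }
      return NULL;                             /* miss */
  }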
The second benefit (mainly on a Cray) is that the probes above are done as
vector operations. The two vector loads then have different 'strides', so the
two processors don't line up on the same group of memory banks (the Cray can
use _huge_ levels of interleaving, 256 banks and up), and things run
significantly faster because there are fewer bank conflicts between the two
processors. This turns out to be a really significant win on a Cray, and was
the _main_ reason we did it.
Lastly, as the first point shows, it does a better job of spreading hash
entries around in the table, which helps make sure the whole table gets used.
On a PC this is not so great. The best plan there is to test 8 consecutive
entries, because of how the cache is burst-loaded 32 bytes at a time. The old CB
approach results in 8 x 32-byte cache-line fills (one per entry, even though each
entry was only 12 bytes long), while the more 'PC-ish' approach references
8 entries in 96 consecutive bytes, which is only 3 cache-line fills. Almost 3x
faster.
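For comparison, a similar sketch of the consecutive-entry probe (reusing
HashEntry, table, and TABLE_SIZE from the sketch above; again just an
illustration, not code from any actual engine):

  /* The 8-entry bucket sits in consecutive memory, so it touches only a few
     cache lines instead of 8 widely scattered ones. */
  static HashEntry *probe_consecutive(uint64_t sig)
  {
      uint32_t start = (uint32_t)(sig & (TABLE_SIZE - 1));
      for (int i = 0; i < 8; i++) {
          HashEntry *e = &table[(start + (uint32_t)i) & (TABLE_SIZE - 1)];
          if (e->key == sig)
              return e;            /* hit */
      }
      return NULL;                 /* miss; the caller picks a victim to overwrite */
  }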
Hope that helps...