Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: Rehashing (was Re: Hash replacement question)

Author: Robert Hyatt
Date: 10:14:43 01/15/98
On January 15, 1998 at 10:27:28, Stuart Cracraft wrote:

>The research I've read indicates that a two-tiered
>structure with nodes-based-replacement for the one
>slot that most people check for subtree height is
>preferred.
>
>That is, the number of nodes that were searched
>for this subtree must be greater than the number of
>nodes recorded for the position already in the hash
>table.

I tried this when the discussion hit r.g.c.c...  for me, it didn't
work.  If you don't do many extensions, this might pan out.  But if
you do, node counts will vary based on how many extensions you do,
not necessarily on how much work was done below that position.  IE
I once counted such info in Cray Blitz when we were discussing this a
while back in r.g.c.c, and found two sibling nodes where one had one
node below it examined, the other had one million nodes examined,
purely because of the extensions (singular and others) that I do/did
in CB.  I found that depth-preferred seemed to work better for me at
the time.  Of course, it might have changed, but I tried it on two
different programs (CB and Crafty) and depth-preferred was always just
a little more efficient than nodes-preferred replacement.


>
>Recently, I replaced my single-tier depth-based
>replacement with single-tier nodes-based and
>experienced a small improvement.
>
>Later I will go to a two-tier algorithm so that new
>things always get stored.
>
>Also, I think we need to discuss rehashing. Does somenoe
>have this implemented? It seems like rehashing for say
>8 slots based on nodes-recorded would be better than
>single-tier since quite a few collisions could be handled
>without replacing information already at that slot.
>

Rehashing is better.  In Cray Blitz, we used 8 slots.  Basically,
we took the hash key and divided it into three fields:

<lock><index><table_address>

<table_address> is just what you'd expect... the address of the
initial table probe.  <index> is added to table_address 7 times to
produce the other 7 entries in this "eight entry bucket" approach.
<lock> is simply the "rest" of the hash signature.

This has two good features:  (1) you get to choose between replacing one
of eight entries, sort of an 8-way set-associative hashing algorithm;
(2) using the <index> field tends to help avoid "chaining" if you know
what that is all about.

This doesn't work well on a PC however, because the 8 loads on the Cray
were essentially free after the first 1.  On a PC, we have such poor
memory bandwidth, the extra loads place a strain on the memory interface
that will slow things down more than reduced node count will offset.

Don Beal wrote a paper in the JICCA explaining this, but he only
considered
nodes searched.  In that case, probing 4 addresses performed better than
only one, or than the two-tiered approach.  But *only* when the table is
quite saturated.  I avoid most of this by not hashing q-nodes, so I
don't
run into table saturation in normal circumstances...



>Does anyone have some good algorithms for this? Let's say
>I hash to index X and find it is filled. What is a good
>algorithm to choose another index to try? And it seems like
>that algorithm would have to be deterministic because the
>next time into this entry, I'd want to go search those
>alternate indexes/slots in the same order to find where I'd
>stored it.
>
>--Stuart


use the Cray Blitz approach.  the danger is that when you hash
to entry X, you try X, X+n, X+2n, etc.  If you later hash to Y,
and either Y, Y+n, Y+2n, ... hit one of the X's above, they will
"match" from that point forward.  In the CB approach, n would
likely be different for two different hash signatures, so that
the index between "bucket entries" would not be a constant, giving
better and more uniform distribution of entries..
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.