Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hash Table Collisions

Author: Andrew Dados

Date: 08:32:39 04/11/00

Go up one level in this thread


On April 11, 2000 at 11:04:35, Colin Frayn wrote:

>This has been bugging me for a while.
>
>I've read a few articles and messages recently talking about hashtables, and how
>different people implement them and deal with collisions.
>
>Basically, the idea is to take the hash key, and then somehow map it
>non-injectively onto the (presumably smaller) hash space by taking the remainder
>of a modulus division.  Problem with this is of course that you get collisions,
>whereby two _different_ keys map onto the _same_ hash index.  This isn't the
>same as two different _positions_ mapping onto the same hash _key_, but is due
>to the many-to-one mapping from the key to a hash index.
>(the mapping to a hash key had better be bijective or your program is gonna
>start churning out strange results :) )
>
>As far as I can see, many people get round this by choosing whether to keep the
>original entry or to replace it with the new entry whenever a collision occurs.
>
>Why is this?  Isn't that wasteful?
>
>The way I implement my hashtable, I have each hash entry followed by a linked
>list of hash entries which are all different positions, but all map to the same
>hash index.  Every time a position is stored in the hash table, I check to see
>if it is already there, in which case I only replace it if the new version is
>deeper.  If not, then I simply add another element at the head of the linked
>list at the correct index value with the _new_ position and its values.
>
>Admittedly this takes more space to store as I have the hash key stored along
>with the entry (because I need to distinguish between entries with identical
>hash indices, but different keys.)  but recent studies have shown that the
>number of entries in a hash table (past a certain limit) makes hardly any
>difference to the playing quality anyway.  Certainly not in Blitz.
>
>Is this method dumb? Why?  It just takes a little nasty pointer arithmetic, but
>other than that it's pretty straightforward.  The length of the list is rarely
>more than 2 or 3 entries, so it takes no time at all to traverse.
>
>Input appreciated.
>
>Cheers,
>Col

Now, suppose you run out of HT space entirely... what will you do? Discard new
positions? Or devise some replacement schema anyway?

Let's count.. suppose your program stores 100k positions/sec in HT. Given your
position storage space requires 2 pointers, it will be around 24b per position.
That gives some 2MB/sec in worse case (rough max estimate, can be actually
1Mb/sec or lower). It will not take too long to fill in entire TT....
So why not use clever replacement schema from start?

-Andrew-



This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.