Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Dumb hashing question

Author: José de Jesús García Ruvalcaba

Date: 15:14:14 12/09/99

Go up one level in this thread


On December 09, 1999 at 17:41:00, Robert Hyatt wrote:

>On December 09, 1999 at 15:46:54, Dann Corbit wrote:
>
>>If I make a hash key of n bits, how can I make a smaller hash table than 2^n.
>>
>>I don't get it.  (e.g. My question about shrinking the hash table was met with
>>"It does not matter how small the table is -- it is the size of the key that
>>matters.").  This caused me to wonder why my brain is so tiny.  I mean, how can
>>I have a hash table with a great big key and not allocate enough space to hold
>>it? Do you have a smaller hash leader (half-key, quarter-key, whatever) and then
>>throw the rest into a list of some kind?
>>
>>What gives?
>
>There are lots of answers.  This is called the "collision replacement policy".
>
>We often mix the term collision badly.  Some call it the condition that occurs
>when two _different_ positions produce the same hash signature.  This is
>generally incorrect.

	What is the technically correct term for this condition? (two different
positions with the same hash signature).

> Others call it the condition that occurs when two
>different hash signatures end up needing to be stored at the same table address.
>
>The latter is the definition used by Knuth, and is good enough to answer your
>question.  If you have a 64 bit hash signature (as I do) you can take any N
>bits from that signature to address a table of size 2^N.  IE if you want 4
>entries in your table, use the right-most 2 bits of the 64 bit hash signature.
>Which gives you a value between 0 and 3, just enough to address your table.
>
>We now have two issues.  Assuming you use a 64 bit hash signature, what is the
>probability that two different chess positions produce the same 64 bit hash
>signature?  That would cause serious problems, because we store search results
>and identify these results by the 64 bit signature, not the full position
>description.  If two different positions have the same signature, we would use
>search results in the wrong places..  very bad.  But this is _very_ rare. Maybe
>1 signature duplication every billion nodes or so, more than acceptably small.
>

	I got the impression that Dann is asking for this issue, using the generally
incorrect term 'collision'.

>The other issue is when you use a 64 bit signature, but your table is less than
>2^64 entries.  So you use some part of the signature.  And now the liklihood of
>a collision is real, because _many_ different 64 bit signatures will match in
>the rightmost N bits for a 2^N entry table.  You store the full 64 bits in the
>entry to be sure you don't get false matches, but at times you have to choose
>between entry A and B.  Which to keep?  There are plenty of rules (you can look
>at hash.c in Crafty for one approach)...



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.