Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: About debugging HashTables implementation .

Author: Will Singleton

Date: 21:36:07 04/12/02

Go up one level in this thread


On April 12, 2002 at 17:24:43, Robert Hyatt wrote:

>On April 12, 2002 at 15:44:46, Will Singleton wrote:
>
>>On April 12, 2002 at 10:43:42, Pat King wrote:
>>
>>>On April 10, 2002 at 20:47:06, Will Singleton wrote:
>>>
>>>>On April 10, 2002 at 17:31:20, Andrew Williams wrote:
>>>>
>>>>>On April 10, 2002 at 15:28:26, Pat King wrote:
>>>>>
>>>>>>On April 09, 2002 at 12:11:08, Andrew Williams wrote:
>>>>>>
>>>>>>>On April 09, 2002 at 11:56:45, Pat King wrote:
>>>>>>>
>>>>>>>>On April 08, 2002 at 15:17:46, Robert Hyatt wrote:
>>>>>>>>
>>>>>>>>>How do you define "collision"?  Normally this means two different board
>>>>>>>>>positions produce the same hash signature.  If you are getting _those_ you
>>>>>>>>>have a Zobrist hashing bug.
>>>>>>>>
>>>>>>>>How do you detect collisions, then? Record the entire position in the hash
>>>>>>>>table? Even for a debug build, that seems like it would be impractical. >>Although undetected hash bugs are even less practical :)
>>>>>>>>
>>>>>>>
>>>>>>>You check to see if the move stored in the hash entry is legal or not. If
>>>>>>>the hash keys match but the move isn't legal in the position, it means that
>>>>>>>you have a hash collision.
>>>>>>>
>>>>>>>Andrew
>>>>>>
>>>>>>But you rely on the hashed positions being different enough for that move to be
>>>>>>illegal. Certainly, your technique detects collisions, but not all collisions!?
>>>>>>
>>>>>>Pat
>>>>>
>>>>>Yes. I should have been clearer. I don't have any code in PM to detect
>>>>>collisions. As you say, what I described wouldn't catch all collisions.
>>>>>
>>>>>Andrew
>>>>
>>>>As always seem to be the case in computer chess, there are competing definitions
>>>>for the same terms (collisions of collisions, if you will).
>>>>
>>>>In the Dec 1996 issue of ICCAJ, Beal and Smith discuss collisions in their
>>>>article on multiple-probe hashing.  Your definition of a collision, if I read
>>>>you correctly, is what Beal & Smith would call a false-match error (negligible,
>>>>in their opinion).  Collisions, otoh, they say occur when two positions gen the
>>>>same hash >index<, but different keys.  This, they say, can occur frequently,
>>>>and requires a replacement scheme to determine which is kept and which
>>>>discarded.
>>>
>>>The definition of collision being used in this thread was clearly spelled out by
>>>Bob, and lurks above in all the re:s above.
>>>It is especially relevant to this thread, where we humble beginners (please
>>>don't notice how long I've been in that category) struggle to get even the
>>>basics right.
>>
>>
>>What Bob Said was this:
>>"How do you define "collision"?  Normally this means two different board
>>positions produce the same hash signature.  If you are getting _those_ you
>>have a Zobrist hashing bug.  If you mean two different signatures hash to the
>>same board position then perhaps your 64 bit random numbers are very poorly
>>chosen..."
>>
>>I think what he meant by the second sentence was: if two different keys hash to
>>the same spot in the table, then your numbers aren't random enough.
>>
>>This would follow the definition of collision that has been used in the articles
>>I have read.
>
>That is sometimes called a "collision" but it is incorrectly done.  A collision
>is technically "two unique positions that produce the same hash signature"  In
>normal hashing (not chess) you take something (say a social security number)
>and hash it to something used to address a table for random access.  In chess,
>we hash a position to a signature, then we hash the signature (usually by
>ANDing with a mask) to reduce it to a table index.
>
>This second kind of collision (two hash signatures yield the same table index
>when ANDed with some mask) must be handled as you mention.  It boils down to
>an effective replacement policy.  The first kind of hashing collision can not
>be tolerated, and with 64 bit keys should not happen if the Zobrist random
>numbers are chosen very well...
>

>> But it seems clear, to me, that from my testing, collisions are
>>common and must be managed by an appropriate replacement scheme.  So I don't
>>know why Bob and others seem to adhere to the idea that collisions only occur
>>rarely, or if you have a bug.
>>
>
>Mainly because when we use the term "collision" we mean two unique chess
>positions that produce the same hash signature.  And that is both a problem
>that will cause trouble, and based on lots of 64 bit testing, is probably also
>a bug.
>
>


Right, in my recent testing, I never get those type of collisions.


>
>>
>>If a collision is two different board positions producing the same full 64-bit
>>hash key, then what do you call the event in which two different board positions
>>produce different keys, but map to the same hash slot?
>
>
>As I initially said, in "normal" hashing, the "signature" step is not used
>directly.  You hash something to an index and go there.  We are doing it in
>two steps and _either_ can be called a collision.  However, in my case, I
>don't call two different signatures yielding the same table index a collision,
>I call that something that just happens naturally.  It isn't a bug at all
>assuming the table index distribution seems to be pretty uniform.

OK.  I guess we'll call it a "fender-bender."


>
>One common error I used to see was people building zobrist random numbers by
>jamming two 32 bit floats together.  Unfortunately, that didn't produce a 64
>bit random number because 9 of the bits (sign + exponent) are not very random.
>That led to "clustering of indices" with many "gaps".  Ideally you'd like your
>hash signatures to distribute addresses all over the table in a very uniform
>way.  And generally good Zobrist numbers will do this effectively...
>

Yes. When I tried the shift/or with two 32-bit ints, I got a significant
clustering of numbers, which could be plotted and seen easily with Excel.  My
current scheme gives a good distribution, and I'm happy with it.

btw, thanks for addressing this.  I think that people toss around chess terms
that are not well-defined, with resulting confusion.  I still think that the
collision term is best defined by Beal, as I have indicated.  Two hashkeys
competing for the same table slot, or index, colliding.  As you define it, is
really a bug, and should not occur; hence the term is not meaningful.

Will



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.