Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hash Collision

Author: Robert Hyatt

Date: 08:25:44 12/07/02

Go up one level in this thread


On December 07, 2002 at 09:28:55, David Rasmussen wrote:

>On December 07, 2002 at 08:01:46, Brian Richardson wrote:
>
>>
>>There are two situations that are sometimes called hash collisions.
>>One is when just the first n bits of the hash index or key are the same.
>>This happens quite a lot, and I don't personally consider it a collision.
>>The other is when all bits of the hash signature are the same.
>
>In both of these cases, the term "collision" is correct, but refers to different
>things.
>
>The first one is general for the concept of hash tables. If your hashing
>function (in this case a mod or and of the zobrist signature) maps to different
>entries to the same slot, you have a "collision" (hashing collision), and in
>many applications you simply have a linked list or some other container, of the
>values with this hash value. In most chess programs, we replace by some
>replacement scheme.
>
>The second one is special for our purpose. Using a zobrist scheme for position
>signatures, we hope to obtain unique signatures for unique positions, at least
>as far as the search will take us in the lifetime of a hash entry. That is, some
>position in the opening might have the same signature as one in the late
>endgame, that doesn't matter, but we hope that no two positions that are "near"
>eachother have the same signature. Because we can't tell the difference between
>them in any other way than by looking at the signature. If it happens, we have a
>"collision" (signature collision). This is what happened here. And these are
>what I am asking about. How many of these do people experience in the course of
>1000000 games?
>
>Of course we can't be sure that only the ones caught by pseudo legality are
>there. There could be two different positions with the same signature and in
>which the stored move is pseudo legal. If so, we just hope that the score or
>bound stored in the hash table isn't too far off compared to the real value for
>this position. If it is, it could end up influencing the PV, and even the first
>move of the PV. We never know, unless we do some testing where we store the
>entire position with the hash entry.
>
>>I have never seen this with 64 bit values.
>
>I hadn't either until now. But I am pretty sure that Bob sais something about
>this happening once in 100 games or so.

That was an estimate.  Think about the math.  You are collapsing some 2^168
positions into 2^64 hash space.  That is a _huge_ aliasing problem.  So the
probability of getting duplicates is not just a probability, it is an absolute
certainty...


>
>>I would not exit in either case, but just continue searching.
>
>I wouldn't either if it was important. In this case, it hadn't bothered me
>before, and I thought it was nice that it happened, because otherwise I wouldn't
>have known. If I just wrote it to a log etc. I would have stopped looking for it
>months ago, since it hasn't happened in nearly 2 years, when I started this
>rewrite.
>
>/David



This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.