Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: hash collisions

Author: Robert Hyatt

Date: 17:09:41 07/09/02

Go up one level in this thread


On July 09, 2002 at 19:37:52, martin fierz wrote:

>On July 09, 2002 at 19:19:45, Robert Hyatt wrote:
>
>>On July 09, 2002 at 18:26:33, martin fierz wrote:
>>
>>>On July 09, 2002 at 17:59:17, Robert Hyatt wrote:
>>>
>>>>I have been running some tests after prompting by Bruce, and the results
>>>>have been interesting.
>>>>
>>>>The question posed by him was "how many hash collisions (signatures match
>>>>but positions do not) can we tolerate in a search without having serious
>>>>problems?"
>>>>
>>>>I did the following:  I made my HashProbe() routine use the node counter,
>>>>and every N nodes, I would "match" a hash entry even if the signatures were
>>>>not the same, simulating a collision.  I got all the way down to a collision
>>>>every 1000 nodes without seeing even a score change at the root, which was
>>>>surprising.
>>>
>>>i'm surprised that you're surprised :-)
>>>i have found and stomped so many bugs in my checkers program's evaluation (and
>>>surely there are lots more in there...) that i wonder how it ever played a
>>>decent game at all. but of course, the answer is that an alpha-beta search masks
>>>such trouble. i guess that as long as your hash collisions don't occur close to
>>>the root, they will have a similar effect as a badly misevaluated position at a
>>>leaf node - none at all, in most cases.
>>>i also would guess that if you try long enough (on a whole set of test
>>>positions), you should find something.
>>>
>>>i also have a question about the experiment you suggest: you are producing
>>>random hash collisions; i.e. let's say a hash collision you get in one search
>>>causes a fail-low, and on the research it disappears again, so you get the
>>>normal eval & pv. i think real hash collisions are deterministic, so you will
>>>always collide with the same positions, so i think your experiment does not
>>>represent a real-world hash collision problem scenario.
>>
>>This is not hard to fix, but remember, I am comparing scores iteration by
>>iteration, not just when the search terminates.  _any_ score change in a
>>single iteration would be noticed, but so far, zero have happened...  It
>>would not be hard to simply take the random "match" that I am using, change
>>it to the current signature so that it is a "real match" and then store it
>>in the table again.
>
>i realized that - i just meant to say that a real collision would be worse than
>what you are doing. of course, if you count such a fail-low & back-to-normal
>event as a problem in your stats, that is ok.
>
>>>i always thought i should do the following experiment one day: reduce the number
>>>of bits in my "hash lock" (the number i compare to see if the position in the
>>>hashtable matches the one i'm looking for), and look when the search starts to
>>>change. that's a one-line change, instead of
>>>if(hashtable[index].lock == hashlock) you just & both sides with something like
>>>1<<N-1. you can check the frequency of your hash collision by checking the whole
>>>lock which should under normal circumstances give close to 0 collisions.
>>>
>>>aloha
>>>  martin
>>
>>
>>I am also testing that way as well.  But that way gives an unknown number of
>>"errors" per second.  My first approach gives me a specific error rate so that
>>I can eventually say "we can stand one error per X nodes, but one error every
>>Y nodes will produce errors significant enough to be noticed."
>>
>>Your way will produce something different...   we can use a signature of X
>>bits without effect, but less than X causes errors...
>>
>>I am not sure _either_ will mean a lot overall, of course. :)
>
>it's true that the second method gives an unknown number of errors per second,
>but of course you can measure it (if you compare all bits of the hash
>signature). all that happens is that the points on the graph you produce are
>spaced unevenly - at least that's what i would expect. but that doesn't seem to
>be a problem to me. i also think the second method answers a more relevant
>question: how many bits do you need in your signature? this question has been
>asked here a couple of times in the past years, and such an experiment gives a
>practical answer. in theory, you need many bits, but in practice i guess even
>lots of collisions will not hurt you much, so you need much less.
>
>BTW, why are you conducting this experiment? i always assumed that a 32-bit hash
>signature was ultra-safe.
>
>aloha
>  martin


It came after a discussion with Bruce about the "lockless hashing" article
by myself and Tim Mann in the current ICGA journal.  The question he asked
was "I see why your approach solves the problem, but is the problem serious
enough to need a solution?"

It was a good question that deserves an answer more detailed than "I believe
so..."



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.