Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hash codes

Author: Dan Newman

Date: 02:23:08 11/14/01

Go up one level in this thread


On November 14, 2001 at 04:06:13, Gian-Carlo Pascutto wrote:

>On November 14, 2001 at 02:37:45, Dann Corbit wrote:
>
>>On November 14, 2001 at 02:03:47, Gian-Carlo Pascutto wrote:
>>
>>>On November 13, 2001 at 19:34:44, Dann Corbit wrote:
>>>
>>>>64 bits is generally recognized as a good size for a hash, and 32 bits is
>>>>*definitely* too small.
>>>
>>>I am still eagerly waiting to see someone actually produce
>>>valid proof for this claim.
>>
>>Some programs may have a much bigger problem (e.g. fast searchers).
>>Imagine a program that searches 1M nodes/second.  There are plenty that can
>>manage that, on steamy hardware.
>>In 1000 seconds, there are 1 billion nodes, and in 4000 seconds (a bit over an
>>hour) a 32 bit integer has been exhausted. By this time, it is absolutely sure
>>to have collisions.  Now, imagine a correspondence player, letting the computer
>>think overnight.  Each position will have been randomly overwritten 20 times a
>>day (roughly speaking).
>>
>>I saw a rough mathematical justification somewhere (but I can't remember where
>>it was).
>
>
>As Uri stated, this is not a proof at all 32 bits is not enough.
>
>I also disagree with your reasoning. There is no guarantee that
>if you search 2^32 that you will have a guaranteed collision, unless
>your hash table has 2^32 entries.
>
>Another point is that it does not matter that you have collisions,
>as long as they do not happen at critical points. The structure of
>the search nearly assures this.
>
>I would like to see someone test this, rather than to keep reiterating
>unfounded hearsay.
>
>--
>GCP

A mathematical proof (I suspect) would be rather difficult and involved
and perhaps not even possible, but a demonstration is fairly easy.

I think several of us here have done the experiment.  ISTR that Bob did
this some time ago (perhaps with Cray Blitz) and I vaguely recall that
Vincent (or was it Ed?) also has measured the (high) error rate that you
get with too few bits.  I'm sure others have, too.

When I measured it several years back I got about one collision (error)
per second with a 32-bit hash code.  (I can't remember the probe rate,
but it was probably about 100k/s at the time.)

Anyway, 32 bits seems too small, but I don't really know if the errors
would have any measurable effect on engine strength.  I've always just
assumed that any such error is a bad thing, and 64 bits virtually
eliminates them...

-Dan.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.