Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hash collisions and a little maths

Author: Vincent Diepeveen

Date: 18:06:53 02/18/99

Go up one level in this thread


On February 18, 1999 at 06:32:12, Rémi Coulom wrote:

>I just calculated the probability to have two identical values in a list of p
>values chosen at random among q possible values. Maybe everybody knew this
>already, but I was ignorant of the result and I find it is interesting. The
>result is, for large values of q and p < q, equivalent to exp(-(p*p)/(2*q)). It
>means that the probability of collision becomes high as soon as p is close to
>the square root of q.

Cool that there is someone who can fiddle with math.

Can you change all this to a function that
gives the estimated number of hash collisions with inputs:

  - #bits
  - size of hashtable
  - loading factor of hashtable

Many thanks,
Vincent

>A consequence of this is that if you take 1,000 people, you have a very high
>probability that two of them have the exact same number of hair on their heads,
>assuming that the order of magnitude of this number is 100,000.
>
>A less funny consequence for computer chess programmers is that if you use 2^64
>possible values for hash codes, you have a 1% probability of collision for 600 M
>values chosen at random. Of course, hash codes in a search tree are not
>independant and identically distributed.
>
>More experienced programmer maybe knew this since long, but I did not after 4
>years of using a hash table and I thought some of you might be interested.
>
>By the way, I wonder how big were the hash codes in Deep Blue.
>
>Remi



This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.