Author: Rémi Coulom
Date: 03:32:12 02/18/99
I just calculated the probability to have two identical values in a list of p values chosen at random among q possible values. Maybe everybody knew this already, but I was ignorant of the result and I find it is interesting. The result is, for large values of q and p < q, equivalent to exp(-(p*p)/(2*q)). It means that the probability of collision becomes high as soon as p is close to the square root of q. A consequence of this is that if you take 1,000 people, you have a very high probability that two of them have the exact same number of hair on their heads, assuming that the order of magnitude of this number is 100,000. A less funny consequence for computer chess programmers is that if you use 2^64 possible values for hash codes, you have a 1% probability of collision for 600 M values chosen at random. Of course, hash codes in a search tree are not independant and identically distributed. More experienced programmer maybe knew this since long, but I did not after 4 years of using a hash table and I thought some of you might be interested. By the way, I wonder how big were the hash codes in Deep Blue. Remi
This page took 0.02 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.