Author: Bruce Cleaver
Date: 07:05:15 12/11/02
After reading about the subject of hash collisions (what are the chances of a hash collision during a game?) I thought about it and suddenly realized it was similar to the so-called 'Birthday problem' (how many people must be gathered together to have a 50% chance that at least two people share the same birthday?). The Birthday problem is interesting because people assume the answer is 183 (i.e. 365/2) to get a 50% chance, but in fact the answer is only 23! With 38 people it climbs to about 90%. 100% chance requires 366 people. Rather counter-intuitive. At any rate, the hash-collision problem is similar in form. Assumptions are: 1) hash signature is 64 bits. 2) hashing function is well written so that it is uniformly distributed over the space of 2^64. 3) All hash nodes are kept and not replaced. I realize this does not accord with actual practice, but cut me some slack here. Gotta walk before running. Maybe I can ginn up an answer with replacement later. The generalized form for a hashing collision after N nodes (all kept) is: 1 - [((2^64 - 1)/2^64)*((2^64-2)/2^64))*(...)*((2^64-N)/2^64)] 'Simply' pick N, loop thru N times and see what the answer is. IF you wish to see what the value of N is for 50%, that will require a certain number (probably on the order of 10^11 nodes). For a 90% chance, the number N must be bigger, etc. Now comes a problem - this probably requires extended-accuracy arithmetic (128 bits or more) to perform correctly. Can anybody here help out? Bruce
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.