Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Generalized estimate of hashing collision rate

Author: Bruce Cleaver

Date: 09:30:12 12/11/02

Go up one level in this thread


On December 11, 2002 at 11:29:36, Sune Fischer wrote:

>On December 11, 2002 at 10:05:15, Bruce Cleaver wrote:
>
>>After reading about the subject of hash collisions (what are the
>>
>>chances of a hash collision during a game?) I thought about it and
>>
>>suddenly realized it was similar to the so-called 'Birthday problem'
>>
>>(how many people must be gathered together to have a 50% chance that at
>>
>>least two people share the same birthday?).  The Birthday problem is
>>
>>interesting because people assume the answer is 183 (i.e. 365/2) to get
>>
>>a 50% chance, but in fact the answer is only 23!  With 38 people it
>>
>>climbs to about 90%. 100% chance requires 366 people.  Rather
>>
>>counter-intuitive.
>
>Yep, it is the birthday problem. If you search the archieves you will find
>discussions about this :)
>
>-S.

Rats!  I re-invented the wheel.  At least I got it right :)

After some thought, I now know what the solution is with replacement...if you
assume 16-byte hash nodes, and assume 128 MB of hash (a typical example), then N
can only be as large as 8,388,608 before replacement must occur.  This means
that the chances of a collision would be quite small over any one game.  You
would calculate the equation (above) to obtain the chances over any one game,
and then apply a similar form to calculate the odds over multiple games.  The
chances over any ONE game would be very small, maybe only 1e-04 or so (pure
guess!), but cumulated over several games you could see how many games are
required to get a 50% chance, 90% chance, etc:

1 - (1.0 - 1e-04)^NG  where NG = # of games.

No doubt THAT has been discussed too....



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.