Author: Sune Fischer
Date: 15:35:06 04/07/04
Go up one level in this thread
On April 07, 2004 at 18:03:24, martin fierz wrote: >On April 07, 2004 at 11:59:58, Sune Fischer wrote: > >>>>It is quite similar from the point on where the hash table is full. >>> >>>absolutely not! >>>the probability that *any* two 64-bit signatures are the same goes from 0...1 as >>>you increase table size and is about 0.5 at 2^32 entries (and goes to 1 very >>>rapidly from there). >> >>I suspect you may in the heat of the moment have thought >>that 2^32 is half of 2^64? > >mmmh, if i thought that they would have thrown me out of university after a >year. so no, i didn't think that, and i don't think that even when tired or >drunk :-) Ok, you never know, that would have explained a lot :) >no, the formula above stems from the fact that if you have a generalized >birthday problem with N days in a year and M pupils, you need approximately M = >sqrt(N) pupils to get a probability of 0.5 that two of them have their birthday >on the same day. which is just applied above. Yes approximately. The true formula is a nasty product of terms with big factorials. >>>the collision probability is given only by the number of independent >>>verification bits that you have remaining in your hash signature. >>> >>>cheers >>> martin >> >>Yes it's not _the_ birthday problem because we are searching for a different >>answer, but it is the same setup. >> >>If you have 363 kids in the class and add another kid chance his birthday will >>collide is very close to 1. > >it's *not* the same setup. because we are *never* anywhere near what you just >described. your example translated to a chess program would mean that you have >close to 2^64 entries in your hashtable, which you don't. instead, you have a >measly 2^20. Yeah but the numbers aren't important. To make the analog more obvious we can pretend we have a hash table size of 2^64 and then ask how many we need to fill to get a collision probability of 50%. This would be identical to the birthday problem. >what is also different is that the collision frequency is a totally different >measure than the birthday problem. i already said that, but it seems i have to >explain: the birthday problem asks: "how likely is it that *any* two pupils will >have their birthday on the same day?" while the hash collision frequency asks >"if i add one more pupil, how likely is he to have birthday on the same day as >any of the pupils i have up to now?". so the birthday problem is the integral >over the collision frequency, very, very different... i hope i explained it >properly this time. Thanks for the explanation, but I was just talking about the model behind it all :) -S. >cheers > martin
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.