Author: Robert Hyatt
Date: 08:01:58 04/07/04
Go up one level in this thread
On April 07, 2004 at 10:54:57, Sune Fischer wrote: >On April 07, 2004 at 10:37:28, martin fierz wrote: > >>On April 07, 2004 at 10:21:38, Sune Fischer wrote: >> >>>>>> >>>>>>your number is much too high. a good estimate for the hash collision probability >>>>>>is 1/sqrt(hashkey_range); which in your case comes out as 2^-32 or about >>>>>>somewhere around one in a billion... >>>>> >>>>>Not at all. It's way less than that with 64 bits Zobrist. >>>> >>>>i never took the trouble to measure it, but the math is rather straightforward. >>>>so i'd say you have an implementation bug too ;-) >>>> >>>>cheers >>>> martin >>> >>>Not necessarily. If he has a very large hash table with more than 2^32 entries >>>the probability of a collision increases rapidly with 64 bit keys. >> >>you are right of course that vincent should have given the size of his >>hashtable. but: 2^32 = 4 billion, with 12 bytes per entry (?) you are talking >>48GB here. i hardly think his hashtable has that kind of size. then again, with >>his supercomputer stuff, who knows? >>i tacitly assumed that he didn't have that kind of table size. > >I think he might have been using something silly like that on the computer. > >>besides that, it doesn't really matter: you are probably interested in a >>collision frequency, e.g. 1 collision per X lookups. the birthday problem >>doesn't really apply to this: it gives you the probability that *any* two >>hashkeys in your table are identical. this probabiltiy has a different behavior >>than the actual collision rate... > >It is quite similar from the point on where the hash table is full. > >If you have 2^64 different keys and 2^20 of these are already used in the table, >the probability for the next one (assuming it is different) is going to be 2^46, >ie. a frequency of 2^-46. > >While you are still filling the table the probabilities are lower of course due >to "fewer kids in the class". > >Only new keys not already stored in the table can collide. >So to get the real frequency one should not count the true hash hits, and there >will be lots of those too so collision rate is a bit lower. > >Another reducing factor is present if you don't probe/store qsearch, since >collisions on these nodes aren't possible either. > Another issus is how the signatures are created. They are not completely random, they are related, in that sig(plyx) is based on sig(plyx-1) with some bits changed. That is a different assumption than in the birthday paradox... >-S. >>cheers >> martin >> >>
This page took 0.19 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.