Author: martin fierz
Date: 08:20:18 04/07/04
Go up one level in this thread
On April 07, 2004 at 10:54:57, Sune Fischer wrote: >On April 07, 2004 at 10:37:28, martin fierz wrote: > >>On April 07, 2004 at 10:21:38, Sune Fischer wrote: >> >>>>>> >>>>>>your number is much too high. a good estimate for the hash collision probability >>>>>>is 1/sqrt(hashkey_range); which in your case comes out as 2^-32 or about >>>>>>somewhere around one in a billion... >>>>> >>>>>Not at all. It's way less than that with 64 bits Zobrist. >>>> >>>>i never took the trouble to measure it, but the math is rather straightforward. >>>>so i'd say you have an implementation bug too ;-) >>>> >>>>cheers >>>> martin >>> >>>Not necessarily. If he has a very large hash table with more than 2^32 entries >>>the probability of a collision increases rapidly with 64 bit keys. >> >>you are right of course that vincent should have given the size of his >>hashtable. but: 2^32 = 4 billion, with 12 bytes per entry (?) you are talking >>48GB here. i hardly think his hashtable has that kind of size. then again, with >>his supercomputer stuff, who knows? >>i tacitly assumed that he didn't have that kind of table size. > >I think he might have been using something silly like that on the computer. > >>besides that, it doesn't really matter: you are probably interested in a >>collision frequency, e.g. 1 collision per X lookups. the birthday problem >>doesn't really apply to this: it gives you the probability that *any* two >>hashkeys in your table are identical. this probabiltiy has a different behavior >>than the actual collision rate... > >It is quite similar from the point on where the hash table is full. absolutely not! the probability that *any* two 64-bit signatures are the same goes from 0...1 as you increase table size and is about 0.5 at 2^32 entries (and goes to 1 very rapidly from there). the collision probability is given only by the number of independent verification bits that you have remaining in your hash signature. cheers martin > >If you have 2^64 different keys and 2^20 of these are already used in the table, >the probability for the next one (assuming it is different) is going to be 2^46, >ie. a frequency of 2^-46. > >While you are still filling the table the probabilities are lower of course due >to "fewer kids in the class". > >Only new keys not already stored in the table can collide. >So to get the real frequency one should not count the true hash hits, and there >will be lots of those too so collision rate is a bit lower. > >Another reducing factor is present if you don't probe/store qsearch, since >collisions on these nodes aren't possible either. > >-S. >>cheers >> martin >> >>
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.