Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hash collisions (was Re: Let's analyze move 36)

Author: Robert Hyatt

Date: 18:08:43 10/08/97

Go up one level in this thread


On October 08, 1997 at 15:20:38, Dan Homan wrote:

>On October 08, 1997 at 13:49:14, Chris Whittington wrote:
>
>
>>
>>Ok let's try my maths model. tell me if it predicts the results.
>>
>>Node rate N nodes per sec
>>Hash lookups N/4 hits per sec
>>
>>Chance of hash collision is related to number of bits in hash signature,
>>suppose we have n bits. Collision chance per pair of hash hash lookups =
>>1 / 2^n
>>
>>The hash collision rate is unfortunateyl going to be proportional to
>>(roughly) the SQUARE of the hash lookup rate.
>>
>>Total collisions per second = 1 / 2^n  x N/4 x N/4
>>
>> = N^2 / 2^n / 16
>>
>>ideally we could go for, say, only one hash collision per week,
>>
>>Someone with a calculator and some spare time ....
>>
>>How big  a hash bit count would we need for a 10,000, a 100,000 and a
>>2,000,000 nps program ?
>>
>>Alternatively and additionally, for a 64 bit sized hash index, what
>>collision rate will a 10,000, a 100,000 and a 2,000,000 nps program give
>>?
>>
>>Does this agree with Bob's empirical results ?
>>
>>Chris
>
>Shouldn't this also depend on the size of the table?  Including table
>size, my calculation is a little different...
>
>P = probability of a collision on a given table lookup
>T = total number of table entries
>p = probability of two positions having the same hash
>    signature ( = 1/2^n , as chris stated, with n = number of bits in
>hash)
>N = nodes searched per second (assume one table lookup at each node)
>R = rate of collisions
>
>So
>
>P = 1 - (1-p)^T  (This is just 1 minus the probablity of no collision)
>
>if (1/p) >> T  the P = Tp (approximately)
>
>The rate R is then given by....
>
>R = NP
>
>for a 256K entry table, 32 bit hash code, 30K nodes / sec
>  (i.e.  T = 256000, n = 32, N = 30000/sec )
>
> R = 1.78 collisions/sec
>
>for a 48 bit hash code: R ~ 3E-5 collisions/sec
>for a 64 bit hash code: R ~ 4E-10 collisions/sec
>
>T should be the number of hash table entries that are filled, so the
>rate
>of collisions is less than this for a partially filled table.
>
> - Dan
>
>P.S.  This was done quick, so somebody please check my math.
>
>---------------------------------------------------------------------
>Dan Homan                               dch@vlbi.astro.brandeis.edu
>Physics Department                      dch@quasar.astro.brandeis.edu
>Brandeis University                     homan@binah.cc.brandeis.edu
>---------------------------------------------------------------------
>       http://quasar.astro.brandeis.edu/BRAG/people/dch.html
>---------------------------------------------------------------------

That is damned good.  Because with 32 bit hash keys, 1 per second is not
something that strikes me as being far off from what I was seeing when
Stanback and I were discussing this a while back.  Need to get him here
too as he might recall his data better.  But this seems to be maybe
right
on the mark for 30K nps and 32 bits...  and for 64 bits, because I was
only seeing one every now and then when I tested that...



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.