Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hashtables: is larger always better?

Author: Robert Hyatt

Date: 13:32:35 09/24/01

Go up one level in this thread


On September 24, 2001 at 16:24:13, Antonio Dieguez wrote:

>On September 24, 2001 at 16:17:14, Robert Hyatt wrote:
>
>>On September 24, 2001 at 15:41:07, Antonio Dieguez wrote:
>>
>>>On September 24, 2001 at 15:27:35, Robert Hyatt wrote:
>>>
>>>>On September 24, 2001 at 13:53:58, Antonio Dieguez wrote:
>>>>
>>>>>
>>>>>>Several hash into 2 X 32 bit values.  You store one value, you use the other
>>>>>>to generate the hash index.  This is not quite as safe as a true 64 bit hash
>>>>>>signature where all 64 bits are used, but it is pretty good.  If you have
>>>>>>one million entries in the table, your hash key is 52 bits long, effectively,
>>>>>>which is not horrible.  Not as good as 64, but not horrible.
>>>>>
>>>>>hi. isn't one million of entries around 2^20, so just 44 bits are used for the
>>>>>key, (not 52) ?
>>>>
>>>>I am assuming that the hash signature is made up of two 32-bit words.  One of
>>>>the 32 bit words is stored in the hash entry.  The other is used to generate
>>>>the index.  That gives 32 + 20 == 52 bits used if you have a 1M entry table.
>>>
>>>yep, sorry.
>>>
>>>>>what I see is that 48 bits with separate hashindex is already safer than 64 bits
>>>>>without separate index when using just 131072 entries (=47 bits), so I may be
>>>>>not understanding something.
>>>>
>>>>You aren't really using 48 bits.  You are using more.  You are using the number
>>>>of bits you store in an entry + the number of bits you use to produce the table
>>>>index.  In your case 65 (48 + 17).
>>>
>>>I'm hashing just 48 bits so what do I lose? only a few cycles. And what do I
>>>win? that I don't make 100 times more possible a collision if I increase the
>>>hashtable 100 times.
>>
>>Increasing the hash table 100 times does not make the probability of a collision
>>100 times greater.
>
>well, see my other post to Heiner and tell me where I am wrong.
>>
>>64 bits has been proven to do pretty well in today's chess engines.  John
>>Stanback, myself, and a couple of others tested this several years ago in a
>>thread on R.G.C.C, and with 64 bits we didn't have enough collisions to think
>>about, while with 32 bits the collisions were overwhelming.
>
>you said it, "several years ago"

2-3-4 years ago.  But doing searches with _billions_ of nodes to check for
collisions.  We are nowhere near doing searches that big today, even on fast
SMP hardware.  I let at least one test grind on a Cray for weeks, to see how
many collisions I found.  It was so small as to be ignorable.  I think it was
maybe 2-3 collisions for every 8 hours of 1M nps searching or some such thing...



>
>>What you are "losing" if anything, is that it takes you longer to hash 65
>>bits than it does to hash 64.  And if 64 is "good enough" then the wasted time
>>is not needed.  IE you _could_ hash to 128 bits, but the cost would outweigh
>>the gain.
>>
>>>
>>>>Some use 2 x 32, and store one of those and use part of the other for the
>>>>index.  That is _clearly_ better than just using 32, period, with the index
>>>>coming out of the 32 somewhere.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>Also another stupid question, in another post I see calculated the index for
>>>>>hashtable with HV%N, with N the capacity, in that case is it a bit safer to not
>>>>>use an N=2^something? or it is almost the same or there are drawbacks, or I'm
>>>>>not understanding other thing?
>>>>
>>>>If you use %, then you can use any size hash table you want.  I don't use %
>>>>because it generally requires two registers, one for the quotient, one for
>>>>the remainder.  Not to mention integer divide has always been a pretty slow
>>>>operation.  With a perfect power of 2, you can use & (2^N - 1) for the
>>>>address extraction...  with a single-cycle instruction penalty.
>>>
>>>sure, but would be interesting to know if using an N!=2^m fixes a bit the
>>>decreasing security of the hash. I don't have clear how much, that is why I
>>>asked.
>>
>>I'm not sure how.  You are not creating or destroying accuracy.  You are just
>>allowing oddball sized hash tables.  your "effective hash signature size" is
>>just the number of bits you are using overall.  using mod might give you 1/2
>>bit more accuracy, or lose 1/2 bit, depending on what you do...



This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.