Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hashtables: is larger always better?

Author: Antonio Dieguez

Date: 13:24:13 09/24/01

Go up one level in this thread


On September 24, 2001 at 16:17:14, Robert Hyatt wrote:

>On September 24, 2001 at 15:41:07, Antonio Dieguez wrote:
>
>>On September 24, 2001 at 15:27:35, Robert Hyatt wrote:
>>
>>>On September 24, 2001 at 13:53:58, Antonio Dieguez wrote:
>>>
>>>>
>>>>>Several hash into 2 X 32 bit values.  You store one value, you use the other
>>>>>to generate the hash index.  This is not quite as safe as a true 64 bit hash
>>>>>signature where all 64 bits are used, but it is pretty good.  If you have
>>>>>one million entries in the table, your hash key is 52 bits long, effectively,
>>>>>which is not horrible.  Not as good as 64, but not horrible.
>>>>
>>>>hi. isn't one million of entries around 2^20, so just 44 bits are used for the
>>>>key, (not 52) ?
>>>
>>>I am assuming that the hash signature is made up of two 32-bit words.  One of
>>>the 32 bit words is stored in the hash entry.  The other is used to generate
>>>the index.  That gives 32 + 20 == 52 bits used if you have a 1M entry table.
>>
>>yep, sorry.
>>
>>>>what I see is that 48 bits with separate hashindex is already safer than 64 bits
>>>>without separate index when using just 131072 entries (=47 bits), so I may be
>>>>not understanding something.
>>>
>>>You aren't really using 48 bits.  You are using more.  You are using the number
>>>of bits you store in an entry + the number of bits you use to produce the table
>>>index.  In your case 65 (48 + 17).
>>
>>I'm hashing just 48 bits so what do I lose? only a few cycles. And what do I
>>win? that I don't make 100 times more possible a collision if I increase the
>>hashtable 100 times.
>
>Increasing the hash table 100 times does not make the probability of a collision
>100 times greater.

well, see my other post to Heiner and tell me where I am wrong.
>
>64 bits has been proven to do pretty well in today's chess engines.  John
>Stanback, myself, and a couple of others tested this several years ago in a
>thread on R.G.C.C, and with 64 bits we didn't have enough collisions to think
>about, while with 32 bits the collisions were overwhelming.

you said it, "several years ago"

>What you are "losing" if anything, is that it takes you longer to hash 65
>bits than it does to hash 64.  And if 64 is "good enough" then the wasted time
>is not needed.  IE you _could_ hash to 128 bits, but the cost would outweigh
>the gain.
>
>>
>>>Some use 2 x 32, and store one of those and use part of the other for the
>>>index.  That is _clearly_ better than just using 32, period, with the index
>>>coming out of the 32 somewhere.
>>>
>>>
>>>
>>>
>>>
>>>
>>>>
>>>>Also another stupid question, in another post I see calculated the index for
>>>>hashtable with HV%N, with N the capacity, in that case is it a bit safer to not
>>>>use an N=2^something? or it is almost the same or there are drawbacks, or I'm
>>>>not understanding other thing?
>>>
>>>If you use %, then you can use any size hash table you want.  I don't use %
>>>because it generally requires two registers, one for the quotient, one for
>>>the remainder.  Not to mention integer divide has always been a pretty slow
>>>operation.  With a perfect power of 2, you can use & (2^N - 1) for the
>>>address extraction...  with a single-cycle instruction penalty.
>>
>>sure, but would be interesting to know if using an N!=2^m fixes a bit the
>>decreasing security of the hash. I don't have clear how much, that is why I
>>asked.
>
>I'm not sure how.  You are not creating or destroying accuracy.  You are just
>allowing oddball sized hash tables.  your "effective hash signature size" is
>just the number of bits you are using overall.  using mod might give you 1/2
>bit more accuracy, or lose 1/2 bit, depending on what you do...



This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.