Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: Hashtables: is larger always better?

Author: Uri Blass
Date: 12:03:57 09/24/01
On September 24, 2001 at 11:00:40, Antonio Dieguez wrote:

>On September 24, 2001 at 00:35:24, Uri Blass wrote:
>
>>On September 23, 2001 at 20:35:39, Heiner Marxen wrote:
>>
>>>On September 23, 2001 at 16:45:08, Antonio Dieguez wrote:
>>>
>>>>On September 23, 2001 at 16:00:26, Uri Blass wrote:
>>>>
>>>>>On September 23, 2001 at 14:17:18, Antonio Dieguez wrote:
>>>>>
>>>>>>On September 23, 2001 at 02:50:59, Uri Blass wrote:
>>>>>>
>>>>>>>On September 22, 2001 at 23:42:29, James Swafford wrote:
>>>>>>>
>>>>>>>>On September 22, 2001 at 22:17:06, Uri Blass wrote:
>>>>>>>>
>>>>>>>>>On September 22, 2001 at 19:08:06, Torstein Hall wrote:
>>>>>>>>>
>>>>>>>>>>On September 22, 2001 at 18:29:46, Andreas De Troy wrote:
>>>>>>>>>>
>>>>>>>>
>>>>>>>>>If the hash tables are very big then the probability for hash collision can
>>>>>>>>>increase and if there are enough hash collisions the result can be a bad move.
>>>>>>>>>
>>>>>>>>>Uri
>>>>>>>>
>>>>>>>>Why do you think the size of the table has any bearing on the number
>>>>>>>>of collisions?  The number of collisions is a function of the
>>>>>>>>"uniqueness" of your key, not how many entries are in your table.
>>>>>>>
>>>>>>>let take an extreme case:
>>>>>>>suppose there is only one entry in your hash tables and you use 64 bit numbers.
>>>>>>>
>>>>>>>The probability for wrong hash collision is 1/2^64 for every new node after the
>>>>>>>first node.
>>>>>>>
>>>>>>>If you have 1000000 entries in your hash tables and the hash tables are full
>>>>>>>then the probability for hash collision is 1000000/2^64 for every new node.
>>>>>>
>>>>>>Hi.
>>>>>>
>>>>>>Just one thing, not everybody use part of the key for the hash index, me not for
>>>>>>example but I know a lot do.
>>>>>
>>>>>How do you use hash tables in this case
>>>>>You have a position and you need to decide if it is in the hash tables.
>>>>>
>>>>>I assume that the first thing that you do is compressing the position to a 64
>>>>>bit number
>>>>>
>>>>>Am I right?
>>>>>
>>>>>In this case you need to decide if the 64 bit number is in your hash tables.
>>>>>How do you do it in a fast way?
>>>>
>>>>I don't know how much slower it is, but I maintain another number also for the
>>>>index, the bigger the hashtable the bigger it is. I'm a bit surprised by your
>>>>question because it is even more natural doing this way, and the other way is
>>>>the one that looks like a trick.
>>>
>>>Huh, now *I* am a bit surprised.
>>>
>>>The natural/normal way to proceed is:
>>>(1) Map the complete board into a single hash value (HV).
>>>    The HV usually is 64 bit wide.
>>>    With the Zobrist hash algorithm this value can be maintained incrementally,
>>>    and need not be recomputed from scratch.
>>>(2) From the HV compute an index into the hash table (HI = hash index).
>>>    Basically a hash table is an array of some fixed size, say N.
>>>    HI is a value between 0 and HI-1.  Mostly we do:  HI = HV % N;
>>>    If N is a power of two, the mod operation can be replaced by masking
>>>    log2(N) low bits from HV:  HI = HV & mask;
>>>(3) Try to find the searched for info in hash_table[HI], first.
>>>(4) If that hash table slot is filled with other data, we may decide to
>>>    search some more slots.  Many different ways to do this.
>>>
>>>Mostly, the HV is stored into the hash table entries.
>>>Since part of the HV has already been used to select the entry, one may
>>>try to reduce the stored value to the not yet used part of HV.
>>>
>>>>>If you have no way to calculate the hash index from the key then I do not see
>>>>>a fast way to do it.
>>>>>
>>>>>I understand that the hash tables is a list of 64 bit numbers
>>>
>>>To Uri:
>>>No, the most important aspect of a hash table is the direct addressing
>>>of its elements.  A "list" is something else.
>>
>>I tried to understand how Antonio does his hash tables when it is impossible
>>to calculate the index directly from the 64 bit number(in his case he said that
>>he used 48 bit number for hash in another post so I am going to use 48 for the
>>rest of this post).
>>
>>He explained in his reply that he also gets the index from the position.
>>
>>I am not sure if I understood him correctly but
>>maybe he means that he does not compress chess positions to 48 bit numbers but
>>to  48+x bits numbers when 2^x is the size of the hash tables and in this
>>case the probability for hash collision in a new node is at most 1/2^48 but in
>>this case I can say that he is using 48+x bits and not 48 bits for hash tables.
>
>you can say whatever you want unless it is not ofensive, since you are free to
>do so :), but what would you mean with that?
>the rest of the bits(the other number) is only to get the index, and only 48
>bits are hashed-wasted-saved-stored-etc. so I would say it is exactly 48 bits
>for hash tables.
>
>be well.

I am not going to discuss about definitions.
I understand that in your case the probability for hash collision is 1/(2^48)
for every new node and you basically compress positions to 48+x bit numbers when
2^x is the size of the hash tables.

Uri
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.