Computer Chess Club Archives

Search

Terms

Messages

Subject: Re: Hash table allocation

Author: Ricardo Gibert

Date: 18:14:40 08/26/03

On August 26, 2003 at 19:33:36, Dieter Buerssner wrote:

>On August 26, 2003 at 19:05:44, Ricardo Gibert wrote:
>
>>In general, this technique is sensitive to the relative sizes of the truncated
>>hash key and the number of entries.
>
>Yes.
>
>>If they are too close, some of the indices
>>could occur with as much as twice the frequency of some others.
>
>Yes. And only, if they come close. The method is a bit tailored to 32 bit
>adressing. Any hash_entry will use at least (say) 12 bytes. The maximum
>"ununiformity" will then be (when you really use the maximum of 4 Gb of HTs,
>that are possible) something like 1 compared to 1+1/12 as the relative
>probabilites. I use always 3 hash entries in a row. So this makes the
>ununiformicity even in the extreme case already lower. I cannot see, how it
>could have a bad influence on the search, when one index is principially chosen
>with a 3% higher probability than another one.
>
>>In this case,
>>using modulo divsion would work better.
>
>No. Modulo has the same problem. This is, how this thread started. Instead of a
>32 bit number, Peter used a power of 2 just larger than the actual number of
>hash entries. And he gave an example, why the indexes are not distributed
>equally in that case (with the modulo method). In such an extreme case, the
>probability for one index will be double than that of another index.


What I meant was the modulo method applied to the 64 bit hash key rather than
the truncated hash key. The degree of overuse is miniscule. Significant problems
do not arise until the modulus approaches the billions of billions.

The multiplication method won't allow you to use the full 64 bits. You have to
throw away enough bits to accommodate the result after the multiplication. This
is why the modulo method is better when the modulus gets really big.

BTW, in the above I have also assumed the hash key has a power of 2 possible
values and the modulus is relatively prime to that power of 2. Any odd number is
relatively prime to a power of 2.


>
>BTW. With some dirty pointer tricks, one can even optimize the multiplication
>method more.

I suggested the multiplication method to Diepeveen awhile back and stated my
expectation that the compiler would make the obvious optimizations. Maybe I was
mistaken.

>
>unsigned long number_of_entries_times_size = numbers_of_entries *
>sizeof(HASH_ENTRY); /* Sometime at initialization time */
>
>unsigned long idx = ((unsigned long long)thirtytwo_bits_of_hashkey *
>number_of_entries_times_size)>>32;
>
>HASH_ENTRY *hp = (HASH_ENTRY *)((unsigned char *)hash_table+idx);
>
>Thereby the need for the internal calculation of the memory location from the
>index, which in general will mean a multiplication, in practice perhaps a shift
>or a lea, will be avoided.
>
>Regards,
>Dieter

Re: Hash table allocation Dieter Buerssner 12:34:18 08/27/03
- Re: Hash table allocation Ricardo Gibert 14:28:23 08/27/03
  - Re: Hash table allocation Dieter Buerssner 14:43:59 08/27/03

This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.