Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: Hamming distance and lower hash table indexing

Author: Tom Likens
Date: 10:29:45 09/03/03
On September 03, 2003 at 11:10:29, Anthony Cozzie wrote:

>On September 03, 2003 at 10:52:35, Tom Likens wrote:
>
>>On September 03, 2003 at 02:36:33, Tony Werten wrote:
>>
>>>On September 02, 2003 at 13:00:05, Tom Likens wrote:
>>>
>>>>
>>>>This is a general query about an issue I've run into and
>>>>I'm wondering if anyone else has dealt with it or if I'm
>>>>just off base.  Essentially, the issue is this- recently
>>>>I started playing around with my hash table's random
>>>>numbers to see if I could improve them.  Currently, for
>>>>the main table I have a Hamming distance of 24 for
>>>>roughly 800 random 64-bit values.
>>>>
>>>>This is where my inquiry comes in, I use the lower N
>>>>bits of my hash key as an index into the table.  I'm
>>>>wondering, even though the overall Hamming distance is
>>>>24 shouldn't I be concerned about the Hamming distance
>>>>of the lower N bits?
>>>
>>>It could be, but it's unlikely it would give problems.
>>>
>>>To make your keys better, you should not strive for the biggest average hamming
>>>distance, but for the biggest minimum hamming distance.
>>>
>>>22 is doable, wich should give you a collision every never.
>>
>>Tony,
>>
>>I was a little imprecise, my *minimum* Hamming distance is 24,
>>the average is closer to 32.  I don't generate my random numbers
>>on the fly, but instead have a separate program that creates
>>the numbers and saves them to a static array that becomes part of
>>the program proper.
>>
>>Last night I changed this program slightly to give me a minimum
>>Hamming distance of 10 on the lower 32-bits (I tried 12 initially
>>but killed it after four hours of run time, without any results).
>>It also verified that the overall minimum distance for the 64-bit
>>values was still 24.
>>
>>Anyway, long story short, my collision rate in the repetition
>>hash table went down significantly.  I intend to run a final
>>experiment tonight to actually measure the collisions for
>>different 32-bit distances (there has to be a graph in here
>>somewhere ;)
>>
>>regards,
>>--tom
>>
>>>
>>>Of coarse you still have the risk of the lower part being worse than the upper
>>>part, but you can just let your computer search a bit longer for a minimum
>>>hamming distance of 11 in the lower part.
>>>
>>>Tony
>>>
>>>>If these bits are alike, even
>>>>though the overal value is reasonable don't it increase
>>>>the probablity of hash collisions considerably?
>>>>Of course, I won't get a false match since I still use
>>>>all 64-bits of the key to indicate if the hash entry
>>>>is valid, but it's time wasted performing multiple
>>>>probes into the table.
>>>>
>>>>I'm also guessing that this could be more of an issue
>>>>for the repetition hash table, since it is quite a bit
>>>>smaller than the main table.  Currently, I don't do
>>>>multiple probes into this table and I've never seen
>>>>an issue.  Still, I'm starting to wonder if there is
>>>>a problem lurking below the surface that I may
>>>>have missed.
>>>>
>>>>Anyway, I'm probably missing something obvious here.
>>>>I intend to run a number of experiments this evening,
>>>>but I was curious if anyone else has given this much
>>>>thought.
>>>>
>>>>regards,
>>>>--tom
>
>
>rather than using index = key & bitmask, zappa uses index = (key ^ key >> 32) &
>bitmask.  YMMV.
>
>anthony

Did this scheme reduce your number of collisions...

a) significantly
b) some
c) no effect (it just seemed like a cool thing to do ;)

Seriously though, I'll give this a try tonight and let you know
what effect it has on the collisions for my set of numbers.

regards,
--tom
Re: Hamming distance and lower hash table indexing Anthony Cozzie 10:43:26 09/03/03
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.