Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hamming distance and lower hash table indexing

Author: Tom Likens

Date: 07:52:35 09/03/03

Go up one level in this thread


On September 03, 2003 at 02:36:33, Tony Werten wrote:

>On September 02, 2003 at 13:00:05, Tom Likens wrote:
>
>>
>>This is a general query about an issue I've run into and
>>I'm wondering if anyone else has dealt with it or if I'm
>>just off base.  Essentially, the issue is this- recently
>>I started playing around with my hash table's random
>>numbers to see if I could improve them.  Currently, for
>>the main table I have a Hamming distance of 24 for
>>roughly 800 random 64-bit values.
>>
>>This is where my inquiry comes in, I use the lower N
>>bits of my hash key as an index into the table.  I'm
>>wondering, even though the overall Hamming distance is
>>24 shouldn't I be concerned about the Hamming distance
>>of the lower N bits?
>
>It could be, but it's unlikely it would give problems.
>
>To make your keys better, you should not strive for the biggest average hamming
>distance, but for the biggest minimum hamming distance.
>
>22 is doable, wich should give you a collision every never.

Tony,

I was a little imprecise, my *minimum* Hamming distance is 24,
the average is closer to 32.  I don't generate my random numbers
on the fly, but instead have a separate program that creates
the numbers and saves them to a static array that becomes part of
the program proper.

Last night I changed this program slightly to give me a minimum
Hamming distance of 10 on the lower 32-bits (I tried 12 initially
but killed it after four hours of run time, without any results).
It also verified that the overall minimum distance for the 64-bit
values was still 24.

Anyway, long story short, my collision rate in the repetition
hash table went down significantly.  I intend to run a final
experiment tonight to actually measure the collisions for
different 32-bit distances (there has to be a graph in here
somewhere ;)

regards,
--tom

>
>Of coarse you still have the risk of the lower part being worse than the upper
>part, but you can just let your computer search a bit longer for a minimum
>hamming distance of 11 in the lower part.
>
>Tony
>
>>If these bits are alike, even
>>though the overal value is reasonable don't it increase
>>the probablity of hash collisions considerably?
>>Of course, I won't get a false match since I still use
>>all 64-bits of the key to indicate if the hash entry
>>is valid, but it's time wasted performing multiple
>>probes into the table.
>>
>>I'm also guessing that this could be more of an issue
>>for the repetition hash table, since it is quite a bit
>>smaller than the main table.  Currently, I don't do
>>multiple probes into this table and I've never seen
>>an issue.  Still, I'm starting to wonder if there is
>>a problem lurking below the surface that I may
>>have missed.
>>
>>Anyway, I'm probably missing something obvious here.
>>I intend to run a number of experiments this evening,
>>but I was curious if anyone else has given this much
>>thought.
>>
>>regards,
>>--tom



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.