Author: Frank Phillips
Date: 10:28:20 09/03/03
Go up one level in this thread
On September 03, 2003 at 10:52:35, Tom Likens wrote: >On September 03, 2003 at 02:36:33, Tony Werten wrote: > >>On September 02, 2003 at 13:00:05, Tom Likens wrote: >> >>> >>>This is a general query about an issue I've run into and >>>I'm wondering if anyone else has dealt with it or if I'm >>>just off base. Essentially, the issue is this- recently >>>I started playing around with my hash table's random >>>numbers to see if I could improve them. Currently, for >>>the main table I have a Hamming distance of 24 for >>>roughly 800 random 64-bit values. >>> >>>This is where my inquiry comes in, I use the lower N >>>bits of my hash key as an index into the table. I'm >>>wondering, even though the overall Hamming distance is >>>24 shouldn't I be concerned about the Hamming distance >>>of the lower N bits? >> >>It could be, but it's unlikely it would give problems. >> >>To make your keys better, you should not strive for the biggest average hamming >>distance, but for the biggest minimum hamming distance. >> >>22 is doable, wich should give you a collision every never. > >Tony, > >I was a little imprecise, my *minimum* Hamming distance is 24, >the average is closer to 32. I don't generate my random numbers >on the fly, but instead have a separate program that creates >the numbers and saves them to a static array that becomes part of >the program proper. > >Last night I changed this program slightly to give me a minimum >Hamming distance of 10 on the lower 32-bits (I tried 12 initially >but killed it after four hours of run time, without any results). >It also verified that the overall minimum distance for the 64-bit >values was still 24. > >Anyway, long story short, my collision rate in the repetition >hash table went down significantly. I intend to run a final >experiment tonight to actually measure the collisions for >different 32-bit distances (there has to be a graph in here >somewhere ;) > >regards, >--tom > >> >>Of coarse you still have the risk of the lower part being worse than the upper >>part, but you can just let your computer search a bit longer for a minimum >>hamming distance of 11 in the lower part. >> >>Tony >> >>>If these bits are alike, even >>>though the overal value is reasonable don't it increase >>>the probablity of hash collisions considerably? >>>Of course, I won't get a false match since I still use >>>all 64-bits of the key to indicate if the hash entry >>>is valid, but it's time wasted performing multiple >>>probes into the table. >>> >>>I'm also guessing that this could be more of an issue >>>for the repetition hash table, since it is quite a bit >>>smaller than the main table. Currently, I don't do >>>multiple probes into this table and I've never seen >>>an issue. Still, I'm starting to wonder if there is >>>a problem lurking below the surface that I may >>>have missed. >>> >>>Anyway, I'm probably missing something obvious here. >>>I intend to run a number of experiments this evening, >>>but I was curious if anyone else has given this much >>>thought. >>> >>>regards, >>>--tom Do you have a link to the code (in C) to calculate the hamming distance? I searched a bit, but found nothing. (I used the mersene twister to calcuate 32 bit random numbers that I concatonate to 64 bit. It would be nice to know how 'good' they are.). Frank
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.