Author: Tom Likens
Date: 10:29:45 09/03/03
Go up one level in this thread
On September 03, 2003 at 11:10:29, Anthony Cozzie wrote: >On September 03, 2003 at 10:52:35, Tom Likens wrote: > >>On September 03, 2003 at 02:36:33, Tony Werten wrote: >> >>>On September 02, 2003 at 13:00:05, Tom Likens wrote: >>> >>>> >>>>This is a general query about an issue I've run into and >>>>I'm wondering if anyone else has dealt with it or if I'm >>>>just off base. Essentially, the issue is this- recently >>>>I started playing around with my hash table's random >>>>numbers to see if I could improve them. Currently, for >>>>the main table I have a Hamming distance of 24 for >>>>roughly 800 random 64-bit values. >>>> >>>>This is where my inquiry comes in, I use the lower N >>>>bits of my hash key as an index into the table. I'm >>>>wondering, even though the overall Hamming distance is >>>>24 shouldn't I be concerned about the Hamming distance >>>>of the lower N bits? >>> >>>It could be, but it's unlikely it would give problems. >>> >>>To make your keys better, you should not strive for the biggest average hamming >>>distance, but for the biggest minimum hamming distance. >>> >>>22 is doable, wich should give you a collision every never. >> >>Tony, >> >>I was a little imprecise, my *minimum* Hamming distance is 24, >>the average is closer to 32. I don't generate my random numbers >>on the fly, but instead have a separate program that creates >>the numbers and saves them to a static array that becomes part of >>the program proper. >> >>Last night I changed this program slightly to give me a minimum >>Hamming distance of 10 on the lower 32-bits (I tried 12 initially >>but killed it after four hours of run time, without any results). >>It also verified that the overall minimum distance for the 64-bit >>values was still 24. >> >>Anyway, long story short, my collision rate in the repetition >>hash table went down significantly. I intend to run a final >>experiment tonight to actually measure the collisions for >>different 32-bit distances (there has to be a graph in here >>somewhere ;) >> >>regards, >>--tom >> >>> >>>Of coarse you still have the risk of the lower part being worse than the upper >>>part, but you can just let your computer search a bit longer for a minimum >>>hamming distance of 11 in the lower part. >>> >>>Tony >>> >>>>If these bits are alike, even >>>>though the overal value is reasonable don't it increase >>>>the probablity of hash collisions considerably? >>>>Of course, I won't get a false match since I still use >>>>all 64-bits of the key to indicate if the hash entry >>>>is valid, but it's time wasted performing multiple >>>>probes into the table. >>>> >>>>I'm also guessing that this could be more of an issue >>>>for the repetition hash table, since it is quite a bit >>>>smaller than the main table. Currently, I don't do >>>>multiple probes into this table and I've never seen >>>>an issue. Still, I'm starting to wonder if there is >>>>a problem lurking below the surface that I may >>>>have missed. >>>> >>>>Anyway, I'm probably missing something obvious here. >>>>I intend to run a number of experiments this evening, >>>>but I was curious if anyone else has given this much >>>>thought. >>>> >>>>regards, >>>>--tom > > >rather than using index = key & bitmask, zappa uses index = (key ^ key >> 32) & >bitmask. YMMV. > >anthony Did this scheme reduce your number of collisions... a) significantly b) some c) no effect (it just seemed like a cool thing to do ;) Seriously though, I'll give this a try tonight and let you know what effect it has on the collisions for my set of numbers. regards, --tom
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.