Author: Dan Newman
Date: 02:58:17 02/11/01
On February 10, 2001 at 10:13:50, Pat King wrote:

>On February 10, 2001 at 02:34:35, Dan Newman wrote:
>
>>On February 09, 2001 at 19:34:03, Ricardo Gibert wrote:
>>
>>>On February 09, 2001 at 19:04:45, David Rasmussen wrote:
>>>
>>>>On February 09, 2001 at 10:39:09, Pat King wrote:
>>>>
>>>>>On February 07, 2001 at 10:59:31, Pat King wrote:
>>>>>
>>>>>>I have seen it written here that with 64 bit Zobrist hashing, the perfect key
>>>>>>should change 32 bits. When I had what I thought to be hashing problems, I
>>>>>>captured some stats on my hash keys. I found that most of them changed 28-36
>>>>>>bits (within 4) with a few outliers as far as 13 bits from "perfection". I also
>>>>>>checked that I was not generating duplicate keys. How good or bad is this?
>>>>>>Should I work on the average, or the outliers? Any comments appreciated :)
>>>>>>
>>>>>>Pat
>>>>>
>>>>>Thanks to all for your thoughtful replies. For what it's worth, the only change
>>>>>I've made is to generate a key set with hamming distance 31-33, with a
>>>>>significant improvement in hash performance (thanks to Ricardo for providing a
>>>>>reasonable argument to justify the 32 bit standard). I have yet to compare this
>>>>>set with Rob's 16 bit criteria for XORing any 2 keys.
>>>>>
>>>>>Pat
>>>>
>>>>But 32 isn't optimal. As high as possible is optimal.
>>>
>>>So far the conclusions I've come up with are:
>>>
>>>(1) A key = 0 should be excluded, since positions with a particular piece on a
>>>particular square will produce the same hash signature as the same position
>>>without that piece.
>>>(2) Keys with a small hamming distance to zero should also be excluded.
>>>(3) Pairs of keys of the form x and ~x should be excluded, since two such pairs
>>>will XOR to zero.
>>>(4) Pairs with a small hamming distance to some x and ~x should also be excluded.
>>>(5) A key set with a hamming distance > 21 and < 43 with respect to each other
>>>*and* to zero should satisfy (1)-(4) above, and it can be quickly produced within
>>>the program without having to read in a file with the keys. Just use the same
>>>random number seed.
>>>
>>>I'm trying to find sources on the net to verify these conclusions, but so far no
>>>luck. What do you think?
>>
>>I agree. Here's another way of looking at it--nothing really rigorous,
>>though. Numbers that flip too many bits are just as bad as numbers that
>>flip too few. Two such numbers (that flip lots of bits) XOR'ed into the
>>hashcode are equivalent to XOR'ing in a single number that flips only a few
>>bits, since the second number will tend to undo what the first did.
>
>Which is why I was aiming for 32 per key, neither too much nor too little. It's
>definitely an improvement over what I had.
>
>>You really want each number to scramble the hashcode as much as possible so
>>that positions that are "near" each other get very different hashcodes.
>>(The idea being that all the positions under consideration during a search
>>will sort of be near each other, relatively speaking. If we ensure they
>>map to very different hashcodes, then we will tend to avoid false matches
>>despite the tremendous oversubscription to each hashcode.)
>>
>>If you have two numbers that each flip lots of bits (have lots of ones), then
>>they'll have lots of bits in common and thus a small hamming distance from
>>each other. If two numbers are very different (approaching x and ~x), then
>>their XOR will have lots of ones. So the effect of using numbers with too
>>large a hamming distance is nearly the same as using numbers with too small
>>a hamming distance.
>>
>>My intuition tells me to split the difference and go with a hamming distance
>>of 32 as optimum. Getting a set of 800 numbers all with a hamming distance
>>of exactly 32 from each other is likely (mathematically) impossible. And
>>anyway, their pairwise combinations won't have that characteristic...
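[Editor's note: conclusion (5) above describes a generate-and-test loop. A minimal sketch in Python, assuming a 22-42 distance window to match the "> 21 and < 43" bounds; the function names, default seed, and key count are illustrative, not from the post:]

```python
import random

def hamming(a, b):
    """Hamming distance between two bit patterns."""
    return bin(a ^ b).count("1")

def gen_keys(n, seed=12345, lo=22, hi=42, bits=64):
    """Rejection sampling: accept a candidate key only if its Hamming
    distance to zero *and* to every previously accepted key lies in
    [lo, hi].  The set is reproducible from the seed alone, so it never
    needs to be read in from a file."""
    rng = random.Random(seed)
    keys = []
    while len(keys) < n:
        cand = rng.getrandbits(bits)
        if not lo <= bin(cand).count("1") <= hi:
            continue  # distance to zero out of range; this also excludes key 0
        if all(lo <= hamming(cand, k) <= hi for k in keys):
            keys.append(cand)
    return keys
```

[Note that since hamming(a, ~b) = 64 - hamming(a, b), keeping every pairwise distance inside [22, 42] automatically keeps every key at least distance 22 from the complement of every other key, covering points (3) and (4) as well.]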
>Actually, I believe it will. If you look at it bit by bit, with a hamming
>distance of 32 each bit has P(0) = P(1) = 0.5, so for two keys
>P(both bits set) = 0.25. And if you look at the XOR truth table, there are
>two ways each to get a 0 and a 1, for P = 0.25 + 0.25 = 0.5. So the
>32 hamming distance is, in some sense, stable. Whether this is good, bad, or
>indifferent...

Well, I tried this out by hand, and at first it seemed to confirm this. I was
able to construct a set of codes that were equidistant from each other, and
sure enough, when XOR'ed pairwise the codes produced were also at the same
hamming distance. The set that I constructed all had equal numbers of 0's and
1's. OTOH, if I take two numbers with unequal numbers of 0's and 1's, say
A=01111111 and B=01110000 (8-bit for convenience), which have a hamming
distance of 4, then A^B=00001111, and that has a hamming distance of 3 from A
and 7 from B. Of course I might have some trouble making a set of more than
two such numbers...

Anyway, a set of 64-bit numbers with a mutual hamming distance of exactly 32
which only gives rise to other numbers at that same distance is actually a
very bad set: you can't reach all the different possible 64-bit bit patterns
by XOR'ing its members together. You need a set that spans the space of
hashcodes. Not really sure what all this means...

-Dan.

>>I guess what you really need is a set of numbers that are as "linearly
>>independent" as possible, so that it takes the combination of many of them
>>to reproduce one of them. If you only had 64 different features to
>>encode, you could have each of these numbers be completely independent
>>of the others (not expressible as an XOR combination of the others), and
>>each resultant hashcode would map one-to-one with the positions so
>>encoded. But we have 800 possible features and don't want to use 800
>>bits.
>>
>>This error-correcting code (BCH) idea sounds interesting. Maybe someone will
>>generate a nice optimal set of 800+ numbers that we all could use :).
>>
>>-Dan.
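[Editor's note: the "spans the space" requirement can be checked mechanically. Each 64-bit key is a vector over GF(2), where XOR is vector addition, and the set can reach every 64-bit pattern by XOR'ing members together iff its rank is 64. A sketch using Gaussian elimination over GF(2); the function name is illustrative:]

```python
def gf2_rank(keys, bits=64):
    """Rank of a set of bit patterns viewed as vectors over GF(2).
    pivot[i] holds a basis vector whose highest set bit is i."""
    pivot = [0] * bits
    rank = 0
    for k in keys:
        cur = k
        for i in range(bits - 1, -1, -1):
            if not (cur >> i) & 1:
                continue
            if pivot[i]:
                cur ^= pivot[i]  # eliminate bit i, keep reducing downward
            else:
                pivot[i] = cur   # new independent direction found
                rank += 1
                break
    return rank
```

[A set of 800 random 64-bit keys will almost certainly have rank 64; but if, as the post argues, every XOR combination of the set stays at weight exactly 32, then no weight-1 pattern is reachable and the span is a proper subspace, so the rank check would flag it.]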
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.