Author: Dan Newman
Date: 01:12:42 02/23/99
Go up one level in this thread
On February 22, 1999 at 14:12:58, Robert Hyatt wrote: >On February 22, 1999 at 11:14:14, Robert Hyatt wrote: > >>On February 22, 1999 at 02:30:37, Don Dailey wrote: >> >>>On February 21, 1999 at 15:58:16, Robert Hyatt wrote: >>> >>>>On February 21, 1999 at 13:23:24, William Bryant wrote: >>>> >>>>>On February 21, 1999 at 10:21:50, Robert Hyatt wrote: >>>>> >>>>>>On February 21, 1999 at 01:14:37, KarinsDad wrote: >>>>>> >>>>>>>I posted this the other day, but it may have been obscured (or not if nobody has >>>>>>>any information on it). >>>>>>> >>>>>>>I have been wondering if changing the Zobrist hash from a set of random number >>>>>>>to a set of non-random and very specific numbers could result in a more even >>>>>>>distribution in the hash table. Has anyone done any work in this area? >>>>>>> >>>>>>>KarinsDad >>>>>> >>>>>> >>>>>>Burt Wendroff and Tony Warnock published a paper in the JICCA a few years >>>>>>ago discussing this. The main issue is to find the hamming distances between >>>>>>_all_ the random numbers, and maximize all of them. Random numbers work fine >>>>>>if they have been tested (I ran all of mine thru a hamming distance analysis >>>>>>when I first started). >>>>>> >>>>>>As far as 'distribution' this is done by the low order N bits >>>>>>(N=log2(hash_size)) of the hash signature. You _could_ do the hamming analysis >>>>>>on just those bits as well. >>>>> >>>>>I check all my random numbers for uniqueness, ie no duplicates,and use >>>>>0xFFFFFFFFFFFFFFFF for the side to move (compliment the side to move), >>>>>but would appreciate information on 'hamming distance analysis' in not to >>>>>complicated to post or send via email. >>>>> >>>>>Thank you, >>>>> >>>>>William >>>>>wbryant@ix.netcom.com >>>> >>>> >>>>"hamming distance" is simply the number of bits two numbers differ in. To >>>>compute the hamming distance for a and b, you just compute popcnt(a^b). Two >>>>numbers can differ in all bits. But more than two can not... I shoot for >>>>no worse than 32 bits that are the same in two random numbers... Assuming >>>>64 bit values... >>> >>>Are you sure NO two are closer (in hamming distance) than 32 bits? How >>>many numbers are in your table? Are you talking about the worst >>>case or is it some kind of average? >> >> >>I don't do this in Crafty. I do (did) in Cray Blitz, because the trees we >>searched were far larger. And what I did there was to generate the random >>numbers separately, and each new number had to pass the test against all the >>previous numbers before it was accepted. And yes, this was time-consuming so >>we made these numbers 'constant'. And no, I am not sure that all the numbers >>were separated by a hamming distance of 32... that goes back in time far enough >>to make it very fuzzy. >> >>However, I don't have that many numbers (1024 you said?) as I only used 12*64 >>since we didn't use any sort of 'boundary squares' in cray blitz. Ditto for >>Crafty where I also use 12*64. I have been meaning to go grab that table of >>numbers and 'steal' it for crafty, but I haven't yet, because it is in the >>syntax of 'fortran'. >> >> >> >> >>> >>>I have a little piece of code I wrote that generates 64 bit random >>>numbers one at a time and tests each one against all the ones >>>previously generated. If the new number is closer than my specified >>>hamming distance, I regenerate that number until it is. To get >>>even 24 bit distances between any two I've had to generate almost >>>half a million numbers. I have 1024 numbers in my table. >>> >>>I don't believe your random number generator is returning numbers >>>this good. Maybe you can precompute them this good, I don't know. >>>I'm trying right now with 32 bits but it's going awfully slow. >> >>I use the Numerical Recipes RNG code. But you are right, it won't produce >>such good hamming distances quickly. I wouldn't be surprised if it takes >>4 billion numbers to get decent random numbers... >> > >I ran some quick tests... to produce a minimum distance of 16, producing >768 random numbers (64 bits), takes 769 numbers from my Random64() function >in crafty. To go to 24 jumps this to a really big number (it took my code >200 million random numbers to find 768 with a hamming distance >= 24 between >_all_ of them. > >One minor trick is that in general, if N is a good number, ~N is also good. > I wonder if this is the case? Say you have a bunch of numbers A, B, C, ... and also ~A, ~B, ~C, ..., then A ^ ~A == B ^ ~B == C ^ ~C == ~0. It seems like a less than desirable result to get the same bit pattern from pairwise combinations of these numbers: you are more likely to get the same hashcode for different positions. What you really want to do is select a bunch of numbers that have as large a degree of "linear" independence as you can manage. You get maximal independence when each number has a different bit turned on: 0000000000000000000000000000000000000000000000000000000000000001 0000000000000000000000000000000000000000000000000000000000000010 0000000000000000000000000000000000000000000000000000000000000100 ... 1000000000000000000000000000000000000000000000000000000000000000 but of course that leaves you with too few numbers. So you must have more bits turned on. That means that it will be possible to express some members of this set of numbers as a "linear" combination of other members: X = A ^ C ^ F ^ G ^ ... You really want to make that as difficult as possible. If members of the set of numbers have too many bits in common (close hamming distance), then you will have trouble -- it becomes "easier" to get the same hashcode from different combinations of numbers. But the same is true if the set contains members that have too few bits in common (as when you include complements). My intuition tells me that trying for a hamming distance of 32 is ideal -- but I don't really know... (I use a sieve that keeps the hamming distance between 16 and 48 for each pairing. It gets very slow near the end...) You could also look at this as an exercise in bit flipping. If your numbers flip too many bits or too few when they are XOR'ed into the hashcode then their effects may be too easily reversed by XOR'ing in other numbers. My intuition is that what you really need is a bunch of 64 bit numbers with exactly 32 bits turned on, all of which have a decent but not too large hamming distance between them. I find it interesting that the Zobrist trick works at all. Given the number of possible chess positions versus the number of 64-bit bit patterns, one can see that that space is vastly over subscribed: if there are 10**40 chess positions and 2**64 == 10**9 hashcodes, that means that there are approximately 10**31 positions for each hashcode on average. I think what makes it work is that during a search we may only hit 10**6 to 10**8 different but closely related positions, and those random bit patterns tend to make even closely related positions have vastly different hashcodes. If you were to choose positions at random from the full set of possible positions, you would expect to find numerous hashing errors in very short order. And yet this is an exceedingly rare event when doing a tree search. -Dan. >I've been trying to find my notes on CB... but now think that maybe '32' was >an 'average' number rather than the abs minimum for all the numbers. > >If I had some time, I'd try to study just what the 'optimal' set would be, >because it must be computable... > > > >>I'll try to figure out a way to post my values here if you are interested. >>64 * 12 isn't horribly big, but when you display them as 64 bit values, they >>are big. The main issue is how to initialize such an array in a portable >>way... IE (BITBOARD ran[12][64] = {{1,2,3},{4,5,6}}; The numbers 1,2,3 >>default to 'int' which is bad. I don't know what to do for all compilers >>(IE 123ULL or (BITBOARD) 123) >> >> >> >> >> >>> >>>So my code is either wrong or I'm not understanding your method. >>>Can you either clarify or send me your table to analyze? >>> >>> >>>- Don >>> >>> >>> >>> >>> >>>- Don
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.