Author: Will Singleton
Date: 21:36:07 04/12/02
Go up one level in this thread
On April 12, 2002 at 17:24:43, Robert Hyatt wrote: >On April 12, 2002 at 15:44:46, Will Singleton wrote: > >>On April 12, 2002 at 10:43:42, Pat King wrote: >> >>>On April 10, 2002 at 20:47:06, Will Singleton wrote: >>> >>>>On April 10, 2002 at 17:31:20, Andrew Williams wrote: >>>> >>>>>On April 10, 2002 at 15:28:26, Pat King wrote: >>>>> >>>>>>On April 09, 2002 at 12:11:08, Andrew Williams wrote: >>>>>> >>>>>>>On April 09, 2002 at 11:56:45, Pat King wrote: >>>>>>> >>>>>>>>On April 08, 2002 at 15:17:46, Robert Hyatt wrote: >>>>>>>> >>>>>>>>>How do you define "collision"? Normally this means two different board >>>>>>>>>positions produce the same hash signature. If you are getting _those_ you >>>>>>>>>have a Zobrist hashing bug. >>>>>>>> >>>>>>>>How do you detect collisions, then? Record the entire position in the hash >>>>>>>>table? Even for a debug build, that seems like it would be impractical. >>Although undetected hash bugs are even less practical :) >>>>>>>> >>>>>>> >>>>>>>You check to see if the move stored in the hash entry is legal or not. If >>>>>>>the hash keys match but the move isn't legal in the position, it means that >>>>>>>you have a hash collision. >>>>>>> >>>>>>>Andrew >>>>>> >>>>>>But you rely on the hashed positions being different enough for that move to be >>>>>>illegal. Certainly, your technique detects collisions, but not all collisions!? >>>>>> >>>>>>Pat >>>>> >>>>>Yes. I should have been clearer. I don't have any code in PM to detect >>>>>collisions. As you say, what I described wouldn't catch all collisions. >>>>> >>>>>Andrew >>>> >>>>As always seem to be the case in computer chess, there are competing definitions >>>>for the same terms (collisions of collisions, if you will). >>>> >>>>In the Dec 1996 issue of ICCAJ, Beal and Smith discuss collisions in their >>>>article on multiple-probe hashing. Your definition of a collision, if I read >>>>you correctly, is what Beal & Smith would call a false-match error (negligible, >>>>in their opinion). Collisions, otoh, they say occur when two positions gen the >>>>same hash >index<, but different keys. This, they say, can occur frequently, >>>>and requires a replacement scheme to determine which is kept and which >>>>discarded. >>> >>>The definition of collision being used in this thread was clearly spelled out by >>>Bob, and lurks above in all the re:s above. >>>It is especially relevant to this thread, where we humble beginners (please >>>don't notice how long I've been in that category) struggle to get even the >>>basics right. >> >> >>What Bob Said was this: >>"How do you define "collision"? Normally this means two different board >>positions produce the same hash signature. If you are getting _those_ you >>have a Zobrist hashing bug. If you mean two different signatures hash to the >>same board position then perhaps your 64 bit random numbers are very poorly >>chosen..." >> >>I think what he meant by the second sentence was: if two different keys hash to >>the same spot in the table, then your numbers aren't random enough. >> >>This would follow the definition of collision that has been used in the articles >>I have read. > >That is sometimes called a "collision" but it is incorrectly done. A collision >is technically "two unique positions that produce the same hash signature" In >normal hashing (not chess) you take something (say a social security number) >and hash it to something used to address a table for random access. In chess, >we hash a position to a signature, then we hash the signature (usually by >ANDing with a mask) to reduce it to a table index. > >This second kind of collision (two hash signatures yield the same table index >when ANDed with some mask) must be handled as you mention. It boils down to >an effective replacement policy. The first kind of hashing collision can not >be tolerated, and with 64 bit keys should not happen if the Zobrist random >numbers are chosen very well... > >> But it seems clear, to me, that from my testing, collisions are >>common and must be managed by an appropriate replacement scheme. So I don't >>know why Bob and others seem to adhere to the idea that collisions only occur >>rarely, or if you have a bug. >> > >Mainly because when we use the term "collision" we mean two unique chess >positions that produce the same hash signature. And that is both a problem >that will cause trouble, and based on lots of 64 bit testing, is probably also >a bug. > > Right, in my recent testing, I never get those type of collisions. > >> >>If a collision is two different board positions producing the same full 64-bit >>hash key, then what do you call the event in which two different board positions >>produce different keys, but map to the same hash slot? > > >As I initially said, in "normal" hashing, the "signature" step is not used >directly. You hash something to an index and go there. We are doing it in >two steps and _either_ can be called a collision. However, in my case, I >don't call two different signatures yielding the same table index a collision, >I call that something that just happens naturally. It isn't a bug at all >assuming the table index distribution seems to be pretty uniform. OK. I guess we'll call it a "fender-bender." > >One common error I used to see was people building zobrist random numbers by >jamming two 32 bit floats together. Unfortunately, that didn't produce a 64 >bit random number because 9 of the bits (sign + exponent) are not very random. >That led to "clustering of indices" with many "gaps". Ideally you'd like your >hash signatures to distribute addresses all over the table in a very uniform >way. And generally good Zobrist numbers will do this effectively... > Yes. When I tried the shift/or with two 32-bit ints, I got a significant clustering of numbers, which could be plotted and seen easily with Excel. My current scheme gives a good distribution, and I'm happy with it. btw, thanks for addressing this. I think that people toss around chess terms that are not well-defined, with resulting confusion. I still think that the collision term is best defined by Beal, as I have indicated. Two hashkeys competing for the same table slot, or index, colliding. As you define it, is really a bug, and should not occur; hence the term is not meaningful. Will
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.