Author: Robert Hyatt
Date: 14:24:43 04/12/02
Go up one level in this thread
On April 12, 2002 at 15:44:46, Will Singleton wrote: >On April 12, 2002 at 10:43:42, Pat King wrote: > >>On April 10, 2002 at 20:47:06, Will Singleton wrote: >> >>>On April 10, 2002 at 17:31:20, Andrew Williams wrote: >>> >>>>On April 10, 2002 at 15:28:26, Pat King wrote: >>>> >>>>>On April 09, 2002 at 12:11:08, Andrew Williams wrote: >>>>> >>>>>>On April 09, 2002 at 11:56:45, Pat King wrote: >>>>>> >>>>>>>On April 08, 2002 at 15:17:46, Robert Hyatt wrote: >>>>>>> >>>>>>>>How do you define "collision"? Normally this means two different board >>>>>>>>positions produce the same hash signature. If you are getting _those_ you >>>>>>>>have a Zobrist hashing bug. >>>>>>> >>>>>>>How do you detect collisions, then? Record the entire position in the hash >>>>>>>table? Even for a debug build, that seems like it would be impractical. >>Although undetected hash bugs are even less practical :) >>>>>>> >>>>>> >>>>>>You check to see if the move stored in the hash entry is legal or not. If >>>>>>the hash keys match but the move isn't legal in the position, it means that >>>>>>you have a hash collision. >>>>>> >>>>>>Andrew >>>>> >>>>>But you rely on the hashed positions being different enough for that move to be >>>>>illegal. Certainly, your technique detects collisions, but not all collisions!? >>>>> >>>>>Pat >>>> >>>>Yes. I should have been clearer. I don't have any code in PM to detect >>>>collisions. As you say, what I described wouldn't catch all collisions. >>>> >>>>Andrew >>> >>>As always seem to be the case in computer chess, there are competing definitions >>>for the same terms (collisions of collisions, if you will). >>> >>>In the Dec 1996 issue of ICCAJ, Beal and Smith discuss collisions in their >>>article on multiple-probe hashing. Your definition of a collision, if I read >>>you correctly, is what Beal & Smith would call a false-match error (negligible, >>>in their opinion). Collisions, otoh, they say occur when two positions gen the >>>same hash >index<, but different keys. This, they say, can occur frequently, >>>and requires a replacement scheme to determine which is kept and which >>>discarded. >> >>The definition of collision being used in this thread was clearly spelled out by >>Bob, and lurks above in all the re:s above. >>It is especially relevant to this thread, where we humble beginners (please >>don't notice how long I've been in that category) struggle to get even the >>basics right. > > >What Bob Said was this: >"How do you define "collision"? Normally this means two different board >positions produce the same hash signature. If you are getting _those_ you >have a Zobrist hashing bug. If you mean two different signatures hash to the >same board position then perhaps your 64 bit random numbers are very poorly >chosen..." > >I think what he meant by the second sentence was: if two different keys hash to >the same spot in the table, then your numbers aren't random enough. > >This would follow the definition of collision that has been used in the articles >I have read. That is sometimes called a "collision" but it is incorrectly done. A collision is technically "two unique positions that produce the same hash signature" In normal hashing (not chess) you take something (say a social security number) and hash it to something used to address a table for random access. In chess, we hash a position to a signature, then we hash the signature (usually by ANDing with a mask) to reduce it to a table index. This second kind of collision (two hash signatures yield the same table index when ANDed with some mask) must be handled as you mention. It boils down to an effective replacement policy. The first kind of hashing collision can not be tolerated, and with 64 bit keys should not happen if the Zobrist random numbers are chosen very well... > But it seems clear, to me, that from my testing, collisions are >common and must be managed by an appropriate replacement scheme. So I don't >know why Bob and others seem to adhere to the idea that collisions only occur >rarely, or if you have a bug. > Mainly because when we use the term "collision" we mean two unique chess positions that produce the same hash signature. And that is both a problem that will cause trouble, and based on lots of 64 bit testing, is probably also a bug. >If a collision is two different board positions producing the same full 64-bit >hash key, then what do you call the event in which two different board positions >produce different keys, but map to the same hash slot? As I initially said, in "normal" hashing, the "signature" step is not used directly. You hash something to an index and go there. We are doing it in two steps and _either_ can be called a collision. However, in my case, I don't call two different signatures yielding the same table index a collision, I call that something that just happens naturally. It isn't a bug at all assuming the table index distribution seems to be pretty uniform. One common error I used to see was people building zobrist random numbers by jamming two 32 bit floats together. Unfortunately, that didn't produce a 64 bit random number because 9 of the bits (sign + exponent) are not very random. That led to "clustering of indices" with many "gaps". Ideally you'd like your hash signatures to distribute addresses all over the table in a very uniform way. And generally good Zobrist numbers will do this effectively... > >> >>> >>>Their definition implies than collisions only occur during hash store, but it >>>seems correct that they can also occur during probes. That is, you probe based >>>on the index, but find a different key stored there. >>> >>>So, it appears that collisions can occur during probe and store, and are reduced >>>by multiple entries per index (if you count only failures after exhausting the >>>multiple probes/stores). Reductions also are made with increasing the hash >>>size, and possibly the hamming distance in the random number list (though this >>>isn't clear, as I get better results with maximizing the hamming distance >>>between the indexes rather than the full number). >>> >>>I seem to get a lot of collisions, but I don't really know what a lot of >>>collisons are, since people count things (and define things) differently. What >>>we need is a uniform definition for every chess term. This would help beginning >>>and experienced programmers alike. >>> >>>Will
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.