Author: Robert Hyatt
Date: 07:24:17 09/25/01
Go up one level in this thread
On September 24, 2001 at 18:06:30, Sune Fischer wrote: >On September 24, 2001 at 16:24:01, Robert Hyatt wrote: > >>Who cares about the number of hash entries? We aren't scanning the entire table to find a false match. We index to one position only. Suppose the table was 2^64 in size so you could store the entire set of possible hash signatures. >>Does that increase the chances of a collision? > >Yes it does - as Uri has also pointed out. > >See, if you make a hash the size of 2^64 and fill it (!) >then you have no free keys left! >The next position key you generate will match one >of those in the table with 100% certainty! > >This is the most extreme case, in fact you will never be able >to update your hash with new positions because you are fooled >to believe you have calculated them all. You will have a very >high collision rate in this case. I disagree there. "high" is relative. There will be 2^96 different positions that each map to the same hash table entry. But the search space will not include _all_ of those unless we somehow find a way to search to the end of the game. It is possible that our search space will never include 2^64 total positions. And if the tree is bushy (as it is at present) rather than being narrow and deep, this further reduces the chances of a collision using the Zobrist approach we now use. So far, the sum-total of all memory manufactured has not reached 2^64. It won't for a long time. Much less putting it on one machine to hold 2^64 hash entries. Even searching 4 B nodes per second, still will require 4B seconds to fill such a table. We are many years away from 4B speeds. The shallower the search, the lower the probability of a collision. IE for a 1 ply search, _no_ collisions will be possible. Ditto for 2 plies. It takes a lot of depth to perturb the hash signature, then un-perturb it back to its original state to get a collision. A _lot_ of depth... I believe Burton Wendroff published some analysis on this in an older JICCA issue. I believe it suggested that full-width 20 ply searches are perfectly safe and free of collisions with properly chosen Zobrist random numbers. > >If you only have a small hash of 2^10 entries, then you "only" >have logged 2^10 keys and so there are many free ones left, and >you will be able to update the keys that don't match. > >Look at it this way: >You have N keys saved in the hash. >You generate a new key matching a bran new position never seen before. >What are the chances that this key collides with one in the hash? >Well there are 2^64 combinations and N "bad" ones you want to avoid: >chances are N/2^64 of a collision. > > >> Suppose your table has 1 >>entry. you can _still_ get a collision. > >Yes, but odds will only be 1/2^64. > It has to be higher. There is a definite probability, with the way the table is updated and probed, that I search two out of a set of 2^96 colliding entries before I search anything else. And I don't believe that is particularly unlikely once we reach the NPS values needed to make this discussion meaningful. For the moment, it is _all_ very moot. We can't search anywhere near deep enough to produce a significant number of 64 bit collisions. As even 4B nodes is a _huge_ search and is nowhere near filling a 2^64 table. For 2^64 the number of potential collisions is huge. For a program that can't search 20 plies full-width in the middlegame, the number of actual collisions is near to zero with reasonable random numbers. >>I think you are trying to consider "time" in the equation here. The farther >>apart two entries are stored in time, the less chance they will falsely match >>since the first has a greater chance of getting overwritten before the second >>is reached. The chances of a collision with 64 bits are so remote, I wouldn't >>give it a moment's thought. Because even with a _full_ 64 bit address space >>for the table, the probability of a false match is one out of 2^64. Very >>small. Because for any 64 bit hash signature, there is only _one_ place I >>will try to find it stored in the table, regardless of how large the table >>is. With that small a chance to get a false match with a full address space, >>smaller tables only reduce the chances further due to overwrites. > >Read some of Uri's posts in this thread, he expains it very well. > >-S.
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.