Computer Chess Club Archives



Subject: Re: Hashtables: is larger always better?

Author: Robert Hyatt

Date: 08:57:36 09/25/01



On September 25, 2001 at 10:48:46, Uri Blass wrote:

>On September 25, 2001 at 10:24:17, Robert Hyatt wrote:
>
>>On September 24, 2001 at 18:06:30, Sune Fischer wrote:
>>
>>>On September 24, 2001 at 16:24:01, Robert Hyatt wrote:
>>>
>>>>Who cares about the number of hash entries?  We aren't scanning the entire table to find a false match.  We index to one position only.  Suppose the table was 2^64 in size so you could store the entire set of possible hash signatures.
>>>>Does that increase the chances of a collision?
>>>
>>>Yes it does - as Uri has also pointed out.
>>>
>>>See, if you make a hash the size of 2^64 and fill it (!)
>>>then you have no free keys left!
>>>The next position key you generate will match one
>>>of those in the table with 100% certainty!


That is correct.  But the way Zobrist hashing works, it is far more likely
that that position is a _real_ match rather than a false match.  So once you
fill the table, you cannot assume that _every_ probe from that point forward
is a false match.  Very few will be, in fact...

That is because of the way the 2^160 total positions are separated into
"groups".  From the opening position I will _never_ search positions with
king and pawn vs. king, so all of those collisions are impossible.


>>>
>>>This is the most extreme case, in fact you will never be able
>>>to update your hash with new positions because you are fooled
>>>to believe you have calculated them all. You will have a very
>>>high collision rate in this case.
>>
>>I disagree there.  "high" is relative.  There will be 2^96 different positions
>>that each map to the same hash table entry.  But the search space will not
>>include _all_ of those unless we somehow find a way to search to the end of
>>the game.
>
>For every entry the search space does not need to include all of the 2^96
>positions but only 2 of them to get a collision.
>
>If you have 2*2^64 different positions in the search space then you have at
>least 2^64 collisions in that theoretic case.

I don't believe any chess program in the foreseeable future will have a search
space of 2^64 or larger.  We barely reach a search space of 2^32 in a
reasonable one-hour search.  Reaching 2^64 would take 4 billion such hours,
which is roughly half a million _years_.

Zobrist hashing localizes the search space, so that probes in one search space
(the original space already searched) are not likely to collide with another
search space that we can't reach from the initial position we searched from...



>
>The reason is that you have 2^64 entries in the hash tables and every hash entry
>has an average number of 2 positions.


Your math leaves out the probability of searching the _right_ position
versus searching a position that merely collides with the right position.
The latter case is far less likely.


>
>Suppose every entry has exactly 2 positions then you get exactly one collision
>in every entry that means 2^64 collisions

That doesn't work.  If I search 2^64 positions, there is really no chance at
all that there will be 2^64 _unique_ positions.  _Most_ will be duplicates,
and hence they cause no problem.  To search 2^64 unique positions would
require a search space _far_ larger than that, to take care of _real_
transpositions that reach the same position by different search paths...





>
>other cases can be only worse(for example 4 positions in one entry and 0
>positions in another entry mean 3 collisions when 2 positions in every one of
>them mean 2 collisions)
>
>2^64 collisions when you have 2*2^64 positions means that in half of the cases
>you got hash collision so the probability is higher because of the size of the
>hash tables.

Again, the basic premise here is flawed.  If I were searching only unique
positions, this would be true, maybe.  But I am not, and it is not.  For
12-ply searches, something like 20% of the positions are transpositions, at
least, and maybe more if the tables were larger.  For 20 plies, this number
will go up significantly, because the longer the paths, the more
transpositions there are.

The chance of an undetected collision is vanishingly small.  So small that the
advantage of a larger hash table is _not_ outweighed by a very small increase
in the chance of collisions.  One day, maybe, it will matter.  We once got
away with 32-bit hash keys; 64 bits will hold for 500 centuries of searching
at today's speeds.  That's good enough to say "it isn't an issue."



>
>I am not sure if my explanation is clear and maybe other people can help to
>explain it more clearly.
>
>Uri


Your explanation is fine.  But it is based on things that are actually not
likely to happen...



