Computer Chess Club Archives



Subject: Re: zobrist hashing

Author: Robert Hyatt

Date: 18:38:32 05/29/99



On May 29, 1999 at 18:21:15, Ricardo Gibert wrote:

>Despite what Hyatt implies, 'mapping' is not mutually exclusive with 'hashing'.
>His definition of hashing is really a better definition of a many-to-one
>mapping.  Hashing is a key-scattering function used to distribute keys more
>evenly.  Computer scientists aren't very good at formalizing terminology, since
>they are of a more practical bent, like engineers.  Hyatt's error is the same
>one made by many others, e.g. Sedgewick, Horowitz, etc.  Knuth & Wirth are
>better examples of computer scientists who use terminology & definitions more
>correctly.
>
>Another example is the term "perfect hash function".  It is not an oxymoron.  A
>hash function is perfect if it does not produce a collision.  It is minimal
>perfect if it does not produce any "gaps" either.  Such "hash" functions exist,
>but they usually change with each additional key and are costly to compute.
>Algorithms to find such "hash" functions are known that are polynomially
>bounded.  They are not practical for an application unless the key set is small
>and for the most part fixed.
>
>What you are looking for are digital search trees (or radix searching), which,
>following Hyatt's "definition", will 'hash' the keys the way you want ;-).  This
>has other drawbacks, however.  You may take a look at an article by Bentley in
>Dr. Dobb's on ternary search trees, which is a radix search method.  Very
>interesting.  If I remember correctly, he indicates it to be more efficient
>than hashing.  Hard to believe.  The advantage of hashing is that it is simple
>and the hash value can be computed "differentially", which I imagine is useful
>chess-wise.
>
>By the way, I don't mean to be too hard on Professor Hyatt.  He is an
>outstanding computer scientist AND individual.  The time he spends patiently
>answering other people's questions is amazing and commands respect, BUT as a
>professor of computer science, he should be a bit more careful.  I hope a
>little scolding will do some good and that he takes this in the spirit
>intended.
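The "differential" computation mentioned in the quote above is the incremental XOR update that makes Zobrist hashing cheap per move.  A minimal sketch in C (the 12x64 code table, the piece/square encoding, and the PRNG are illustrative assumptions, not any particular engine's layout):

```c
#include <stdint.h>
#include <stdlib.h>

/* One random 64-bit code per (piece, square): 12 piece types x 64 squares.
   A real engine would use a better generator than rand(). */
static uint64_t zobrist[12][64];

static void init_zobrist(void) {
    for (int p = 0; p < 12; p++)
        for (int sq = 0; sq < 64; sq++)
            zobrist[p][sq] = ((uint64_t)rand() << 32) ^ (uint64_t)rand();
}

/* Differential update: XOR the moving piece out of its old square and
   into its new one.  No full-board rehash is needed. */
static uint64_t update_key(uint64_t key, int piece, int from, int to) {
    return key ^ zobrist[piece][from] ^ zobrist[piece][to];
}
```

Because XOR is its own inverse, applying the same update with `from` and `to` swapped (unmaking the move) restores the original key exactly.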


We will just have to agree to disagree.  The definition for 'hashing' that
is used in all the books I know of is always hash(M) -> N, where the size of
M is greater than the size of N.  Otherwise there is no need for hashing.  The
idea is always to take a big thing (i.e., a chessboard, a social security
number, a person's alphabetic name, a patient's medical record number, etc.)
and convert it into a little thing that can be used to probe a directly-indexed
table.  All of my books generally refer to an in-memory array as the target of
the hashing function.

The concept of hashing an M-bit quantity into a different M-bit quantity makes
no sense to me... since you use the M-bit quantity as an index... and if you
end up with a 'thing' of the same size, then you have done _nothing_, because
you still need 2^M entries in your table... and how things are spaced out
doesn't matter there.
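The hash(M) -> N reduction described above is typically a single mask when the table size is a power of two.  A minimal sketch in C (TABLE_BITS and the key value in the comment are illustrative choices):

```c
#include <stdint.h>

#define TABLE_BITS 20                  /* 2^20 entries -- illustrative size */
#define TABLE_SIZE (1u << TABLE_BITS)

/* Fold a 64-bit Zobrist key (the "big thing") down to a TABLE_BITS-wide
   index (the "little thing") that addresses a directly-indexed table. */
static unsigned index_of(uint64_t key) {
    return (unsigned)(key & (TABLE_SIZE - 1));
}
```

Because only the low TABLE_BITS bits select the slot, the full 64-bit key is normally stored alongside each entry so collisions between different positions can be detected.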

I'd be happy to check any reference you want to send me to...  but the
definition I used for hashing has been used forever...  it is in all the
Unix books, where the Unix OS itself hashes block numbers into a small group
of buffer chains, etc...  What possible reason would you suggest for hashing
an M-bit value into a different M-bit value?  I'd have to understand the "why"
for such a definition before the "how" is even considered...
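The Unix buffer-cache example works the same way: a large block-number space is hashed down to a small, fixed number of chains.  A minimal sketch in C (the structure and names are illustrative, not any actual kernel's):

```c
#include <stdint.h>
#include <stddef.h>

#define NBUCKETS 64  /* small, fixed number of buffer chains -- illustrative */

/* A cached disk block, chained off its hash bucket. */
struct buf {
    uint32_t blockno;
    struct buf *next;
};

static struct buf *bucket[NBUCKETS];

/* Hash a large block-number space down to a small chain index. */
static size_t bhash(uint32_t blockno) {
    return blockno % NBUCKETS;
}

/* Walk the chain for a block's bucket; return NULL if it is not cached. */
static struct buf *bget(uint32_t blockno) {
    for (struct buf *b = bucket[bhash(blockno)]; b != NULL; b = b->next)
        if (b->blockno == blockno)
            return b;
    return NULL;
}
```

Only the chains need to exist in memory, not one slot per possible block number, which is exactly the many-into-few reduction being argued for here.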




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.