Author: Robert Hyatt
Date: 18:38:32 05/29/99
Go up one level in this thread
On May 29, 1999 at 18:21:15, Ricardo Gibert wrote: >Despite what hyatt implies, 'mapping' is not mutually exclusive of 'hashing'. >His definition of hashing is really a better definition of a many to one >mapping. Hashing is a key scattering function to distribute keys more evenly. >Computer scientists aren't very good at formalizing terminology, since they are >of a more practical bent like engineers. Hyatt's error is the same one made by >many others e.g. Sedgewick, Horowitz, etc. Knuth & Wirth are better examples of >computer scientists that use terminology & definitions more correctly. >> >Another example is the term "perfect hash function". Not an oxymoron. A hash >function is perfect if it does not produce a collision. It is minimal perfect >if it does not produce any "gaps" either. Such "hash" functions exist, but >usually change with each additional key and are costly to compute. Algorithm's >to find such "hash" functions are known that are polynomially bound. Not >practical for an application unless the key set is small and for the most part >fixed. >> >What you are looking for are digital search trees (or radix serching), which >following Hyatt's "definition", will 'hash' the keys the way you want ;-). This >has other drawbacks, however. You may take a look at an article by Bentley in >Dr Dobb's on trinary search trees, which is a radix search method. Very >interesting. I believe he indicates it to be more efficient than hashing if I >remember correctly. Hard to believe. The advantage of hashing is, it is simple >and the hash value can be computed "differentially", which I imagine is useful >chesswise. >> >By the way, I don't mean to be too hard on Professor Hyatt. He is an >outstanding computer scientist AND individual. The time he spends patiently >answering other peoples questions is amazing and commands respect, BUT as >professor of computer science, he should be a bit more careful. I hope a little >scolding will do some good and he takes this in the spirit intended. We will just have to agree to disagree. The definition for 'hashing' that is used in all books that I know of is always hash(M) -> N, where the size of M is greater than the size of N. Otherwise there is no need for hashing. The idea is always to take a big thing (ie a chessboard, a social security number, a person's alphabetic name, a patient's medical record number, etc) and convert it into a little thing that can be used to probe a directly-indexed thing. All of my books generally refer to an in-memory array as the target of the hashing function. The concept of hashing an M bit quantity into a different M bit quantity makes no sense to me... since you use the M bit quantity as an index... and if you end up with the same size of a 'thing' then you have done _nothing_. Because you still need 2^M entries in your table... and how things are spaced out doesn't matter there. I'd be happy to check any reference you want to send me to... but the definition I used for hashing has been used forever... it is in all the unix books where the Unix OS itself hashes block numbers into a small group of buffer chains, etc... What possible reason would you suggest for hashing an M bit value into a different M bit value? I'd have to understand the "why" for such a definition before the "how" even is considered...
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.