Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: hash tables question

Author: Robert Hyatt

Date: 06:23:12 06/26/00

Go up one level in this thread


On June 26, 2000 at 00:43:26, blass uri wrote:

>On June 25, 2000 at 17:43:59, Robert Hyatt wrote:
>
>>On June 25, 2000 at 14:35:01, blass uri wrote:
>>
>>>On June 25, 2000 at 12:27:21, Robert Hyatt wrote:
>>>
>>>>On June 25, 2000 at 10:15:53, blass uri wrote:
>>>>
>>>>>On June 25, 2000 at 09:40:24, Robert Hyatt wrote:
>>>>>
>>>>>> On June 25, 2000 at 03:27:07, blass uri wrote:
>>>>>>
>>>>>>>Can somebody post a C program that translates arrays to 32 bits integers when
>>>>>>>usually different arrays get different numbers and also translates it in a way
>>>>>>>that it is easy to find if the 32 bits integer is new?
>>>>>>
>>>>>>arrays to a single int is possible, but it depends on what is in the array
>>>>>>as to how the elements might get 'hashed' together.  the last requirement is
>>>>>>very difficult to do (make it easy to find that the new hash signature has not
>>>>>>been computed before.)
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>I think that this is the idea behind hash tables
>>>>>>>
>>>>>>>I need it for my program that I use to solve equations and inequalities.
>>>>>>>
>>>>>>>I have an array possolution[256][100000] and
>>>>>>>I need to check if the possolution[256][i] is not identical to
>>>>>>>possolution[256][j]
>>>>>>>for all j<i(I do i++ only if it is not identical).
>>>>>>
>>>>>>this is not a hashing application.  When you hash, you have the possibility
>>>>>>of a collision (two different source strings producing the same hash
>>>>>>signature).
>>>>>
>>>>>I think that I did not explain myself clearly
>>>>>
>>>>>I need to check if
>>>>>possolution[0][i]....possolution[255][i] is identical to
>>>>>possolution[0][j]...possolution[255][j] for j=0,1...i-1
>>>>>
>>>>>hasing can help me because I know if the hash of the first sequence is a new
>>>>>number that that solution i is not identical to a previous solution without a
>>>>>lot of work.
>>>>
>>>>
>>>>
>>>>the problem is that 'hashing' is going to take a bunch of time itself...  you
>>>>could just xor all the values together into one integer.  If you find two with
>>>>the same XOR sum, then you would need to do the normal loop to be _sure_ they
>>>>are the same.  That would work fine...  hash doesn't match, no need to compare
>>>>all the values.  Hash does match, you then compare all values to make sure
>>>>you didn't get a collision.  Note that if you XOR all values together, abc and
>>>>cba will produce the same hash signature.
>>>
>>>I will try to do xor of all the numbers and I will see if it gives me a small
>>>number of collisions.
>>>if regular xor dos not work I may try to do xor of something like
>>>(possolution[i]<<i)
>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>  IE hashing isn't a pattern-matching algorithm...  it is almost
>>>>>>the opposite.  It compresses a large set of data into one value, although you
>>>>>>have to then handle the case where different sets of data produce the same
>>>>>>hash signature.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>If I can calculate the hash entry of possolution[256][i] and discover in a short
>>>>>>>time that the hash entry of possolution[256][i] is different than the hash entry
>>>>>>>of possolution[256][j] for j<i it will save my program a lot of time
>>>>>>>
>>>>>>>I need to know also how to do it in C with O(log[i]) steps and not in O(i) steps
>>>>>>>and I know only how to do it theoretically in O(log[i]) steps but I do not know
>>>>>>>how to do it in C because I do not know how to push an array forward(if I have
>>>>>>>an array hash[100000] I do not know how to do for (i=35000;i<90000;i++)
>>>>>>>hash[i]=hash[i+1] in a short time)
>>>>>>>
>>>>>>>Uri
>>>>>>
>>>>>>
>>>>>>About the only O(logn) approach to anything is a binary algorithm.  IE if your
>>>>>>'solutions' are sorted, then a binary search would search for a specific match
>>>>>>in O(logn) time.
>>>>>
>>>>>The problem is not that I do not know to search in o(log n) time but how to add
>>>>>the new number in the right place not in O(n) time.
>>>>>
>>>>>I do not know how to do it in less than n steps and I believe that it must be
>>>>>possible because when chess programs add a new position to the hash tables they
>>>>>must put it in the right place in the hash tables(otherwise I see no way for
>>>>>them to search fast if the position is a new position or probably an old
>>>>>position)
>>>>>
>>>>
>>>>
>>>>Chess programs store the data in a random position dictated by the low-order
>>>>N bits of the hash signature.  But we never 'search' this data.  We just probe
>>>>using (hopefully) the same hash signature to find a match, but we only look at
>>>>one (or a very small number) position(s) in the table.
>>>
>>>If you never search this data how do you know if the signature of a position is
>>>the same as the signature of previous position in the hash tables?
>>>
>>>You need to remember the signature of the previous positions and see if the
>>>signature is identical to a previous signature.
>>>
>>>How do you do it without search?
>>
>>I use the low order N bits of the hash signature as an index into the hash
>>table.  I go directly to that entry and check for a match.  If it doesn't
>>match, I quit.
>
>1)Is N a function of the size hash table and is bigger when you use more hash
>tables?
>2)Do you search all the position with the same last N bits to check for a match?
>3)Do you have 2^N pointers when every pointer give you a set of positions to
>check?
>
>Uri


(1)  N=log2(hash_table_size);

That is why the hash table has to be a power of 2 in crafty.

(2) there is only _one_ entry in the table at the index "N", where

N=hash_signature & log2(hash_table_size);

(3) no pointers at all.  The low order N bits of the signature is a distinct
address in the hash table.  I go to that entry, and if the signature there
matches the current signature, I get a "match".  If they don't match, I give up
and say "no match".



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.