Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: hash tables question

Author: Robert Hyatt

Date: 14:43:59 06/25/00

Go up one level in this thread


On June 25, 2000 at 14:35:01, blass uri wrote:

>On June 25, 2000 at 12:27:21, Robert Hyatt wrote:
>
>>On June 25, 2000 at 10:15:53, blass uri wrote:
>>
>>>On June 25, 2000 at 09:40:24, Robert Hyatt wrote:
>>>
>>>> On June 25, 2000 at 03:27:07, blass uri wrote:
>>>>
>>>>>Can somebody post a C program that translates arrays to 32 bits integers when
>>>>>usually different arrays get different numbers and also translates it in a way
>>>>>that it is easy to find if the 32 bits integer is new?
>>>>
>>>>arrays to a single int is possible, but it depends on what is in the array
>>>>as to how the elements might get 'hashed' together.  the last requirement is
>>>>very difficult to do (make it easy to find that the new hash signature has not
>>>>been computed before.)
>>>>
>>>>
>>>>>
>>>>>I think that this is the idea behind hash tables
>>>>>
>>>>>I need it for my program that I use to solve equations and inequalities.
>>>>>
>>>>>I have an array possolution[256][100000] and
>>>>>I need to check if the possolution[256][i] is not identical to
>>>>>possolution[256][j]
>>>>>for all j<i(I do i++ only if it is not identical).
>>>>
>>>>this is not a hashing application.  When you hash, you have the possibility
>>>>of a collision (two different source strings producing the same hash
>>>>signature).
>>>
>>>I think that I did not explain myself clearly
>>>
>>>I need to check if
>>>possolution[0][i]....possolution[255][i] is identical to
>>>possolution[0][j]...possolution[255][j] for j=0,1...i-1
>>>
>>>hasing can help me because I know if the hash of the first sequence is a new
>>>number that that solution i is not identical to a previous solution without a
>>>lot of work.
>>
>>
>>
>>the problem is that 'hashing' is going to take a bunch of time itself...  you
>>could just xor all the values together into one integer.  If you find two with
>>the same XOR sum, then you would need to do the normal loop to be _sure_ they
>>are the same.  That would work fine...  hash doesn't match, no need to compare
>>all the values.  Hash does match, you then compare all values to make sure
>>you didn't get a collision.  Note that if you XOR all values together, abc and
>>cba will produce the same hash signature.
>
>I will try to do xor of all the numbers and I will see if it gives me a small
>number of collisions.
>if regular xor dos not work I may try to do xor of something like
>(possolution[i]<<i)
>
>>
>>
>>
>>
>>>
>>>
>>>  IE hashing isn't a pattern-matching algorithm...  it is almost
>>>>the opposite.  It compresses a large set of data into one value, although you
>>>>have to then handle the case where different sets of data produce the same
>>>>hash signature.
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>If I can calculate the hash entry of possolution[256][i] and discover in a short
>>>>>time that the hash entry of possolution[256][i] is different than the hash entry
>>>>>of possolution[256][j] for j<i it will save my program a lot of time
>>>>>
>>>>>I need to know also how to do it in C with O(log[i]) steps and not in O(i) steps
>>>>>and I know only how to do it theoretically in O(log[i]) steps but I do not know
>>>>>how to do it in C because I do not know how to push an array forward(if I have
>>>>>an array hash[100000] I do not know how to do for (i=35000;i<90000;i++)
>>>>>hash[i]=hash[i+1] in a short time)
>>>>>
>>>>>Uri
>>>>
>>>>
>>>>About the only O(logn) approach to anything is a binary algorithm.  IE if your
>>>>'solutions' are sorted, then a binary search would search for a specific match
>>>>in O(logn) time.
>>>
>>>The problem is not that I do not know to search in o(log n) time but how to add
>>>the new number in the right place not in O(n) time.
>>>
>>>I do not know how to do it in less than n steps and I believe that it must be
>>>possible because when chess programs add a new position to the hash tables they
>>>must put it in the right place in the hash tables(otherwise I see no way for
>>>them to search fast if the position is a new position or probably an old
>>>position)
>>>
>>
>>
>>Chess programs store the data in a random position dictated by the low-order
>>N bits of the hash signature.  But we never 'search' this data.  We just probe
>>using (hopefully) the same hash signature to find a match, but we only look at
>>one (or a very small number) position(s) in the table.
>
>If you never search this data how do you know if the signature of a position is
>the same as the signature of previous position in the hash tables?
>
>You need to remember the signature of the previous positions and see if the
>signature is identical to a previous signature.
>
>How do you do it without search?

I use the low order N bits of the hash signature as an index into the hash
table.  I go directly to that entry and check for a match.  If it doesn't
match, I quit.




>
>I know that chess programs use 64 bits signature and using an array to tell you
>if a new signature appeared is impossible because computers do not have 2^64
>bits in the memory.
>
>Uri



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.