Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: hash tables question

Author: blass uri

Date: 21:43:26 06/25/00

Go up one level in this thread


On June 25, 2000 at 17:43:59, Robert Hyatt wrote:

>On June 25, 2000 at 14:35:01, blass uri wrote:
>
>>On June 25, 2000 at 12:27:21, Robert Hyatt wrote:
>>
>>>On June 25, 2000 at 10:15:53, blass uri wrote:
>>>
>>>>On June 25, 2000 at 09:40:24, Robert Hyatt wrote:
>>>>
>>>>> On June 25, 2000 at 03:27:07, blass uri wrote:
>>>>>
>>>>>>Can somebody post a C program that translates arrays to 32 bits integers when
>>>>>>usually different arrays get different numbers and also translates it in a way
>>>>>>that it is easy to find if the 32 bits integer is new?
>>>>>
>>>>>arrays to a single int is possible, but it depends on what is in the array
>>>>>as to how the elements might get 'hashed' together.  the last requirement is
>>>>>very difficult to do (make it easy to find that the new hash signature has not
>>>>>been computed before.)
>>>>>
>>>>>
>>>>>>
>>>>>>I think that this is the idea behind hash tables
>>>>>>
>>>>>>I need it for my program that I use to solve equations and inequalities.
>>>>>>
>>>>>>I have an array possolution[256][100000] and
>>>>>>I need to check if the possolution[256][i] is not identical to
>>>>>>possolution[256][j]
>>>>>>for all j<i(I do i++ only if it is not identical).
>>>>>
>>>>>this is not a hashing application.  When you hash, you have the possibility
>>>>>of a collision (two different source strings producing the same hash
>>>>>signature).
>>>>
>>>>I think that I did not explain myself clearly
>>>>
>>>>I need to check if
>>>>possolution[0][i]....possolution[255][i] is identical to
>>>>possolution[0][j]...possolution[255][j] for j=0,1...i-1
>>>>
>>>>hasing can help me because I know if the hash of the first sequence is a new
>>>>number that that solution i is not identical to a previous solution without a
>>>>lot of work.
>>>
>>>
>>>
>>>the problem is that 'hashing' is going to take a bunch of time itself...  you
>>>could just xor all the values together into one integer.  If you find two with
>>>the same XOR sum, then you would need to do the normal loop to be _sure_ they
>>>are the same.  That would work fine...  hash doesn't match, no need to compare
>>>all the values.  Hash does match, you then compare all values to make sure
>>>you didn't get a collision.  Note that if you XOR all values together, abc and
>>>cba will produce the same hash signature.
>>
>>I will try to do xor of all the numbers and I will see if it gives me a small
>>number of collisions.
>>if regular xor dos not work I may try to do xor of something like
>>(possolution[i]<<i)
>>
>>>
>>>
>>>
>>>
>>>>
>>>>
>>>>  IE hashing isn't a pattern-matching algorithm...  it is almost
>>>>>the opposite.  It compresses a large set of data into one value, although you
>>>>>have to then handle the case where different sets of data produce the same
>>>>>hash signature.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>If I can calculate the hash entry of possolution[256][i] and discover in a short
>>>>>>time that the hash entry of possolution[256][i] is different than the hash entry
>>>>>>of possolution[256][j] for j<i it will save my program a lot of time
>>>>>>
>>>>>>I need to know also how to do it in C with O(log[i]) steps and not in O(i) steps
>>>>>>and I know only how to do it theoretically in O(log[i]) steps but I do not know
>>>>>>how to do it in C because I do not know how to push an array forward(if I have
>>>>>>an array hash[100000] I do not know how to do for (i=35000;i<90000;i++)
>>>>>>hash[i]=hash[i+1] in a short time)
>>>>>>
>>>>>>Uri
>>>>>
>>>>>
>>>>>About the only O(logn) approach to anything is a binary algorithm.  IE if your
>>>>>'solutions' are sorted, then a binary search would search for a specific match
>>>>>in O(logn) time.
>>>>
>>>>The problem is not that I do not know to search in o(log n) time but how to add
>>>>the new number in the right place not in O(n) time.
>>>>
>>>>I do not know how to do it in less than n steps and I believe that it must be
>>>>possible because when chess programs add a new position to the hash tables they
>>>>must put it in the right place in the hash tables(otherwise I see no way for
>>>>them to search fast if the position is a new position or probably an old
>>>>position)
>>>>
>>>
>>>
>>>Chess programs store the data in a random position dictated by the low-order
>>>N bits of the hash signature.  But we never 'search' this data.  We just probe
>>>using (hopefully) the same hash signature to find a match, but we only look at
>>>one (or a very small number) position(s) in the table.
>>
>>If you never search this data how do you know if the signature of a position is
>>the same as the signature of previous position in the hash tables?
>>
>>You need to remember the signature of the previous positions and see if the
>>signature is identical to a previous signature.
>>
>>How do you do it without search?
>
>I use the low order N bits of the hash signature as an index into the hash
>table.  I go directly to that entry and check for a match.  If it doesn't
>match, I quit.

1)Is N a function of the size hash table and is bigger when you use more hash
tables?
2)Do you search all the position with the same last N bits to check for a match?
3)Do you have 2^N pointers when every pointer give you a set of positions to
check?

Uri



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.