Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Questions about hash table implementation?

Author: Robert Hyatt

Date: 19:37:30 05/22/01

Go up one level in this thread


On May 22, 2001 at 20:21:37, Pham Minh Tri wrote:

>On May 22, 2001 at 09:43:49, Robert Hyatt wrote:
>
>>On May 22, 2001 at 08:03:51, Pham Minh Tri wrote:
>>
>>>Hi all,
>>>
>>>I implemented my hash table by learning from some documents and old posts. It
>>>was a mixture of 32 and 64 bit. It worked but was not good performance. I have
>>>revised it and found some problems, as well as some misunderstands and lack of
>>>knowledge. Now I want to rewrite it in 64 bit. My questions are:
>>>
>>>1) It is necessary to have both hash key and hash signature 64 bit? Could one of
>>>them be 32 bit?
>>
>>
>>I assume you are using "hash key" to be that part of the hash signature you
>>actually store in the table?  If so, this might be pretty safe, so long as you
>>use one end of the signature to store, and the other end to compute the table
>>address.  For small hash tables, you do take a greater chance on getting a
>>collision of course.
>>
>
>I have used a hash key 32 bit and a hash signature 32 bit (same method for
>computing). I used the hash key for finding the table entry, and store the hash
>signature to check. Do you mean only one hash key 64 bit could replace my 2
>numbers and be better?


On a 64 bit machine, yes, the 64 bit single-value would be better/faster.
On 32 bit machines, it is a wash.



>
>>
>>
>>>
>>>2) How to generate a 64 bit random number? (I use ran()*ran(). Is it OK?).
>>
>>I would use ran()<<32 + ran myself.
>>
>>
>>>
>>>3) Which set of random numbers is "bad" for chess? How to generate a "good" set
>>>of random numbers? Is it necessary to filter (prune) some "bad" numbers?
>>
>>
>>Not really.  You can try to optimize the hamming distance between each pair,
>>but that is computationally expensive at setup time and I'm not sure it is worth
>>the trouble.  Lots of duplicates will wreck things of course.
>>
>>
>>
>>>
>>>4) To start, I usually use the command srand((unsigned)time(NULL)) - is it good
>>>or dangerous? Should I use a "good" const?
>>
>>
>>I would use the default for lots of reasons.  1. you will get a different set
>
>Actually, I use the random seed in hope that my program could play the little
>different games even they have the same openings. But I should agree with you
>because of other reasons.
>
>>of random numbers each time you run.  Which means you can't use the hash
>>signatures for your opening book;  2. You don't want an even random seed, and
>>you have a 50% chance of getting one with the above, maybe higher than that.
>>
>>
>>
>>
>>>
>>>5) How to map hash key into hash entries (which the number of hash entries is
>>>not 2^N)? (I use the operation %, but wonder if there is a better way).
>>
>>That's about it if you don't use a power of 2.
>>
>>
>>
>>>
>>>6) I use the operation addition for making new hash key. Certainly, it is slower
>>>than XOR, but anything else? (I heard one time about it, but forgot).
>>
>>XOR is better.  and easier to undo when you take back a move.  If you use
>>add, you can get overflows and when you later subtract, you can have a problem.
>>
>
>I do not really understand this point. If I subtract the same number, the
>variant should be recovered the old number, event last overflow of addition. Is
>it true?
>


If your hardware supports unsigned, add/subtract and xor are equivalent.
if you don't have a native unsigned add/subtract, then you can add and
subtract numbers and end up with a different value than you started with
depending on how the overflow condition is handled...





>>
>>>
>>>7) How to measure the "collision" (I little confuse about the dividend)? And
>>>other measurements?
>>
>>If you do 64 bit signatures, they will be rare enough to ignore.  The only
>>way to measure them is to store both the signature and a real compressed version
>>of the board.  When signatures match but the real board does not, you just had
>>a collision.  You don't do this except for testing, however...
>>
>
>Sorry, but IMO, this is type 1 error. The definition of collision is type 2,
>when 2 positions map onto the same entry, but different hash keys.
>

I don't know of anyone that defines collision as two different signatures
mapping to the same table entry.  Because that is going to happen M/N times
every search where M is total nodes searched and N is table size.  Most call
a "collision" the event that happens when two _different_ board positions
produce the same hash signature.  That will cause problems.

In my books, the "type 2" you mention only is used when you do not store the
_signature_ in the entry.  Then you are at the mercy of the contents of the
entry you map to. Storing the signature eliminates any consequence of this
kind of happening.






>>
>>>
>>>Many thanks for any help and suggestion.
>>>Pham
>
>Thank again everybody for your helps and information.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.