Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Questions about hash table implementation?

Author: Pham Minh Tri

Date: 23:11:06 05/22/01

Go up one level in this thread


On May 22, 2001 at 22:37:30, Robert Hyatt wrote:

>On May 22, 2001 at 20:21:37, Pham Minh Tri wrote:
>
>>On May 22, 2001 at 09:43:49, Robert Hyatt wrote:
>>
>>>On May 22, 2001 at 08:03:51, Pham Minh Tri wrote:
>>>
>>>>Hi all,
>>>>
>>>>I implemented my hash table by learning from some documents and old posts. It
>>>>was a mixture of 32 and 64 bit. It worked but was not good performance. I have
>>>>revised it and found some problems, as well as some misunderstands and lack of
>>>>knowledge. Now I want to rewrite it in 64 bit. My questions are:
>>>>
>>>>1) It is necessary to have both hash key and hash signature 64 bit? Could one of
>>>>them be 32 bit?
>>>
>>>
>>>I assume you are using "hash key" to be that part of the hash signature you
>>>actually store in the table?  If so, this might be pretty safe, so long as you
>>>use one end of the signature to store, and the other end to compute the table
>>>address.  For small hash tables, you do take a greater chance on getting a
>>>collision of course.
>>>
>>
>>I have used a hash key 32 bit and a hash signature 32 bit (same method for
>>computing). I used the hash key for finding the table entry, and store the hash
>>signature to check. Do you mean only one hash key 64 bit could replace my 2
>>numbers and be better?
>
>
>On a 64 bit machine, yes, the 64 bit single-value would be better/faster.
>On 32 bit machines, it is a wash.
>

I would like to make clear on this point. When I use two 32-bit hash values,
they are equal to a 64-bit hash key and work as many other implementations (do
many things with 2 32-bit variants seem to be easier than with 64-bit). The main
different is that I store only _half_ of this 64-bit hash key (store only 32-bit
hash signature). This saves much memory (1/4). However, I am still worrying
about its safety. Does anyone do a test/research about it?

>
>
>>
>>>
>>>
>>>>
>>>>2) How to generate a 64 bit random number? (I use ran()*ran(). Is it OK?).
>>>
>>>I would use ran()<<32 + ran myself.
>>>
>>>
>>>>
>>>>3) Which set of random numbers is "bad" for chess? How to generate a "good" set
>>>>of random numbers? Is it necessary to filter (prune) some "bad" numbers?
>>>
>>>
>>>Not really.  You can try to optimize the hamming distance between each pair,
>>>but that is computationally expensive at setup time and I'm not sure it is worth
>>>the trouble.  Lots of duplicates will wreck things of course.
>>>
>>>
>>>
>>>>
>>>>4) To start, I usually use the command srand((unsigned)time(NULL)) - is it good
>>>>or dangerous? Should I use a "good" const?
>>>
>>>
>>>I would use the default for lots of reasons.  1. you will get a different set
>>
>>Actually, I use the random seed in hope that my program could play the little
>>different games even they have the same openings. But I should agree with you
>>because of other reasons.
>>
>>>of random numbers each time you run.  Which means you can't use the hash
>>>signatures for your opening book;  2. You don't want an even random seed, and
>>>you have a 50% chance of getting one with the above, maybe higher than that.
>>>
>>>
>>>
>>>
>>>>
>>>>5) How to map hash key into hash entries (which the number of hash entries is
>>>>not 2^N)? (I use the operation %, but wonder if there is a better way).
>>>
>>>That's about it if you don't use a power of 2.
>>>
>>>
>>>
>>>>
>>>>6) I use the operation addition for making new hash key. Certainly, it is slower
>>>>than XOR, but anything else? (I heard one time about it, but forgot).
>>>
>>>XOR is better.  and easier to undo when you take back a move.  If you use
>>>add, you can get overflows and when you later subtract, you can have a problem.
>>>
>>
>>I do not really understand this point. If I subtract the same number, the
>>variant should be recovered the old number, event last overflow of addition. Is
>>it true?
>>
>
>
>If your hardware supports unsigned, add/subtract and xor are equivalent.
>if you don't have a native unsigned add/subtract, then you can add and
>subtract numbers and end up with a different value than you started with
>depending on how the overflow condition is handled...
>
>
>
>
>
>>>
>>>>
>>>>7) How to measure the "collision" (I little confuse about the dividend)? And
>>>>other measurements?
>>>
>>>If you do 64 bit signatures, they will be rare enough to ignore.  The only
>>>way to measure them is to store both the signature and a real compressed version
>>>of the board.  When signatures match but the real board does not, you just had
>>>a collision.  You don't do this except for testing, however...
>>>
>>
>>Sorry, but IMO, this is type 1 error. The definition of collision is type 2,
>>when 2 positions map onto the same entry, but different hash keys.
>>
>
>I don't know of anyone that defines collision as two different signatures
>mapping to the same table entry.  Because that is going to happen M/N times
>every search where M is total nodes searched and N is table size.  Most call
>a "collision" the event that happens when two _different_ board positions
>produce the same hash signature.  That will cause problems.
>
>In my books, the "type 2" you mention only is used when you do not store the
>_signature_ in the entry.  Then you are at the mercy of the contents of the
>entry you map to. Storing the signature eliminates any consequence of this
>kind of happening.
>
>
>
>
>
>
>>>
>>>>
>>>>Many thanks for any help and suggestion.
>>>>Pham
>>
>>Thank again everybody for your helps and information.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.