Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Data structures question for hash tables

Author: KarinsDad

Date: 23:41:09 01/22/99

Go up one level in this thread


On January 22, 1999 at 22:24:57, Robert Hyatt wrote:

[snip]
>
>
>If you _insist_ on finding an empty position in the table, then back to Knuth's
>The art of programming again...  Your only choice is is multiple probes.  And
>you really don't want to probe successive addresses now since you are committed
>to storing if at all possible... even though we are going to violate cache
>guidelines, because you are going to run into huge 'chaining' problems.  I
>recommend you use the same thing we used in Cray Blitz (from Knuth of course)
>and just use 'other' bits from the hash signature as a random (but constant for
>a given probe) offset.  If you want to really get cute, you should take this
>random offset, and round it to the nearest 'near-prime'.  In this case, 'near-
>prime' is a number that is 'prime' relative to the size of your hash table.
>
>You do this to be sure that you touch 'every' table entry one time before you
>touch any table entry twice.  If you don't do this, you obviously run the risk
>of not trying every entry, which means you might not find a suitable position
>to overwrite...
>
>let me know if I wasn't clear enough...

Seemed clear enough. But I've realized that I do not need to hit every hash
entry. For the most part, I'll do that anyway (see below).

I think for the time being that I have settled on the following:

Have a 16K hash "array" of 4 byte pointers (i.e. 64K structure).
Within each position, have two pointers for a double link and a signature.

Use any of the hashing techniques which gives a somewhat random distribution
(like Zorbrist). If there is a null in the hash pointer table, create a new node
and put a pointer to it at that location in the hash pointer table. If not, go
to that position. If the signature equals that position, a transposition has
occured. If not, go to the less than pointer if the signature is less than this
position's signature and go to the greater than pointer if the signature is
greater than this position's signature. If the pointer is null, dupe what I did
at the hash pointer table, if not, dupe what I did in the position pointed to by
the pointer.

This would give me a binary tree under my hash pointer table. Not the greatest
of structures, however, I have a fairly large node at the moment. If I have a
good random distribution in the hash (the very question this thread was
attempting to answer), the tree should be fairly balanced.

The problems I see are:

1) I have to prune the tree a lot. I'm hoping to do this by not adding all
children of a given position within my search engine to the hash table. On an
alpha cutoff, I have to search them all, but I only put (at most) the best 3
into the hash table (I do not yet want to try just the best one). On a beta
cutoff, I can just put the one node in. On non-quiescent searches, I think I'm
stuck and have to put them all in, at least intially.

This paragraph may have been confusing to most of you since my search engine is
drastically different than yours. Trust me that this is correct even though it
sounds wrong.

2) I'll never be in cache.

3) My nodes are currently 256 bytes (this will be dropping, but until certain
code gets written, they are huge). Hence, if I can search 20000 nodes per second
* 10 minutes of searching (5 minutes each side) * 60 seconds per minute * 256
bytes per node * 1 node in 4 dupes (i.e. 75% of nodes are transpositions once
you reach 4 or more ply) = 750MB. If I could get 60MB for positions, that means
that I would have to recycle 87% of the nodes. At 36 moves per position on
average, that means that I would need to at most put in 1 node per 12 or 3
nodes. This works out for the alpha case, but the beta case will be something
like 1 beta found in 3 nodes searched, so betas will fill up the hash table and
force me to do further pruning (i.e. overwriting), as will non-quiescent moves.

4) With perfect distibution, 60MB / 16K positions = 3750 positions per hash
entry or 11 nodes deep per tree. This would result in an average of 5.5 probes
per search, however, the distribution will not be perfect, so I'll probably be
at an average of 7 probes. I'll probably have to up my hash pointer table to 1MB
(or greater) or 256K pointers. This would drop my average down to 5 probes (i.e.
7 deep in a perfect distribution) at the cost of using up a lot of memory that
could be used for more nodes (hence, more pruning).

I read what people said about not using linked lists, but my positions (nodes)
are so large at the moment that this will probably be as good as anything else
for now. And of course, once I decrease the size of my nodes, the larger I make
my hash table, the fewer number of probes.

This is probably fair at best, but easy to code and useable for now.

KarinsDad



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.