Computer Chess Club Archives


Search

Terms

Messages

Subject: Technique for measuring hashing error rate

Author: Dan Newman

Date: 21:46:18 10/10/97


Hi all,

I've read with great interest various speculations about (and
calculations of) hashing error rates.  Ed Schröder mentioned
that he'd done some tests which involved storing the whole
position, but that there was a doubling of hash table size and
a huge 20-30x performance hit--which I suppose is large enough
to prevent accurate measurement in a reasonable amount of time.
What follows is a technique that should accurately measure the
rate with a nearly insignificant hit on performance.  (I hope
this isn't well known--I haven't seen it mentioned anywhere
though).

1) Let's assume the position hash code is produced as two 32-bit
   values using the usual Zobrist scheme (and that all the bits
   in these words are "good").  One of these values is stored
   in the hash table entry as a verification key, the other is
   trimmed down (masked) to say 20 bits for indexing the hash
   table (1 M entries).

2) Room is made in the hash table entry (say 8-bits--it could
   easily be smaller) to store a secondary "error detection"
   key.  Eight bits of the unused portion of the hash code are
   stored here (I've always just thrown the remaining portion
   out--giving me a 52-bit signature for a 1 M entry hash
   table).

3) Now, whenever you get a successful hash table hit (i.e., the
   current position's key matches the hash table entry's key
   and all other verification that you normally do succeeds)
   you also compare the error detection keys.  If these keys
   don't match, you record a hashing error.  If they do match,
   you don't know whether there's an error or not.

If you use an 8-bit error detection key, there should only be
approximately one in 256 hashing errors that go undetected by
this scheme.  (If the error rate is large enough you can even
correct it by adding on another 1/256th. :))

There are lots of variations to this scheme: you can test
64-bit hash codes by generating yet another (independent) 8-bit
(or whatever) hash code and using it for the error detection
key.  For testing smaller hash codes one can simply mask off a
few bits from the key when making the usual verification check
and use those few bits as the error detection key:

    instead of

        if( current_key == hashtable_key ) {...}

    do something like this:

        unsigned long test = current_key ^ hashtable_key;
        if( (test & 0x00ffffff) == 0 ) {
            if( test != 0 ) hash_error_count++;
            [...]
        }

    where the ellipsis is for whatever you do when you
    get a hit.

Now, the question is, how should the error rate be expressed,
as a percent of hash table accesses, percent of hits, or
errors per second, or what?--I lean towards the errors/second
measure since it gives you an easy way to calculate the
number of games that are potentially affected.

-Dan.

PS, I just tried this scheme in a old chess engine that I
wrote (the current one doesn't have the transposition table
stuff in it yet)--the error rate was unmeasurable with just
a few minutes of search time (45 bits of hash code, 8-bit
error key).  I lowered the number of bits in the hash code
(index+key) to 29 and got an error rate of about 600-900
errors/second while playing a few moves starting in a couple
of different positions.  I upped the bits to 33 and got
between zero and 8 errors/s.  37 bits gives between zero and
one errors/s.  With 41 bits the error rate was too low for
me to get a good measurement (only 3 errors over a few
minutes).  The above was all at about 70-120 knps with an
always replace replacement scheme.

If the error rate for 29 bits is about 750/s say, and every
bit we add to the hash code lowers this by a factor of two
(this seems correct to me) then for 53 bits (my usual) I
should get 4.47e-5 errors/s or .16 errors/hour--ugh, maybe
I should put those extra bits into the hash table entry,
bump the hash code up to 64 bits (~.7 errors/year).

-Dan.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.