Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Non power of two hash table sizes

Author: Dann Corbit

Date: 15:58:54 02/18/02

Go up one level in this thread


On February 18, 2002 at 17:37:27, Heiner Marxen wrote:

>On February 18, 2002 at 17:07:11, Dann Corbit wrote:
>
>>On February 18, 2002 at 16:56:37, Benny Antonsson wrote:
>>
>>>Why must it be a prime number ?
>>
>>When a composite number is chosen, factors in the number can cause a bad
>>[non-uniform] distribution.
>>
>>See Knuth's "The Art of Computer Programming", Volume 3 'Sorting and Searching'
>>page 516 for a detailed explanation.
>
>IMO this is only an issue if you start with a non-uniform distribution.
>If your hash key is already very good (Zobrist should be good enough),
>you may take any hash table size and do
>
>   hash_index = hash_key % hash_table_size;
>
>(After all, taking the low N bits is just an efficient way to do key % (1<<N))
>
>In doubt (probably anyhow) just measure the quality of the distribution.
>It is of high quality, it behaves similar to a true random distribution.
>Many years ago I've done such experiments, but I do not remember the details
>any more, sorry.

It's probably not much of a worry if you have really good random number
generators and use a sophisticated hash.  It should be noted (however) that many
C distributions have a particulary onerous rand() function (and the one in the
POSIX documentation is just plain *bad*).  From the C FAQ:

13.15:  I need a random number generator.

A:      The Standard C library has one: rand().  The implementation on
        your system may not be perfect, but writing a better one isn't
        necessarily easy, either.

        If you do find yourself needing to implement your own random
        number generator, there is plenty of literature out there; see
        the References.  There are also any number of packages on the
        net: look for r250, RANLIB, and FSULTRA (see question 18.16).

        References: K&R2 Sec. 2.7 p. 46, Sec. 7.8.7 p. 168; ISO
        Sec. 7.10.2.1; H&S Sec. 17.7 p. 393; PCS Sec. 11 p. 172; Knuth
        Vol. 2 Chap. 3 pp. 1-177; Park and Miller, "Random Number
        Generators: Good Ones are Hard to Find".

13.16:  How can I get random integers in a certain range?

A:      The obvious way,

                rand() % N              /* POOR */

        (which tries to return numbers from 0 to N-1) is poor, because
        the low-order bits of many random number generators are
        distressingly *non*-random.  (See question 13.18.)  A better
        method is something like

                (int)((double)rand() / ((double)RAND_MAX + 1) * N)

        If you're worried about using floating point, you could use

                rand() / (RAND_MAX / N + 1)

        Both methods obviously require knowing RAND_MAX (which ANSI
        #defines in <stdlib.h>), and assume that N is much less than
        RAND_MAX.

        (Note, by the way, that RAND_MAX is a *constant* telling you
        what the fixed range of the C library rand() function is.  You
        cannot set RAND_MAX to some other value, and there is no way of
        requesting that rand() return numbers in some other range.)

        If you're starting with a random number generator which returns
        floating-point values between 0 and 1, all you have to do to get
        integers from 0 to N-1 is multiply the output of that generator
        by N.

        References: K&R2 Sec. 7.8.7 p. 168; PCS Sec. 11 p. 172.

13.18:  I need a random true/false value, so I'm just taking rand() % 2,
        but it's alternating 0, 1, 0, 1, 0...

A:      Poor pseudorandom number generators (such as the ones
        unfortunately supplied with some systems) are not very random in
        the low-order bits.  Try using the higher-order bits: see
        question 13.16.

        References: Knuth Sec. 3.2.1.1 pp. 12-14.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.