Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: About random numbers and hashing

Author: David Rasmussen

Date: 12:46:12 12/05/01

Go up one level in this thread


On December 05, 2001 at 11:45:48, Dieter Buerssner wrote:

>
>I agree with almost anything said. Some additional info. Many rand() only return
>15 bits (but not less, because 15 bits is required by the ISO-C Standard). Your
>lowest order period of 2 is the typical case for linear congruential generators
>(LCGs) with a modulus of a power of 2 (all literature suggests prime numbers
>...)

Because prime moduli yields cyclic groups.

>. I think, only some early BSD systems used that bad PRNG. The rand sources
>I know, and also others I don't know, but have tested for there properities,
>still often use LCGs, but typically use the 15 most significant bits of the
>output of a 31 bit LGC. So the period of the last bit should be 2^16. No serious
>simulation would use such a PRNG, but for hashing, it will be enough. You only
>need about 1000 numbers, which is much less then the period of the last bit,
>even when you call rand() 5 times (for 64 bits).
>
>>The best thing to do (I think) is to get a good 32-bit random number
>>generator in C (or whatever language you're using) so that you can get
>>repeatable results on different machines and with different compilers.
>
>I totally agree. Multiply with carry generators need about two lines of code
>(compared to one line for a LCG).
>
>Another curiosity, I want to mention. Often it seems to be stated here, that one
>can look for random numbers with some special properities. Like finding a
>sequence with a good Hamming distance in the cited text above. Such a sequence
>of numbers are not (pseudo) random numbers anymore, because it would not show
>the expected statistics for this properity!
>

Randomness is not important. Distance is. We want our ~1000 keys to have the
property that it takes "many" of them to get zero when xor'ed, in other words we
want the dimension of the algebraic code that these ~1000 make, to be as high as
possible. There exist linear codes with ~1000 codewords, with minimum hamming
distance way higher than the 16 in my codes and 14 in Crafty last time I
checked, and a high dimension also. I talked to my algebraic coding theory
professor about this, and I think the code he mentioned had minimum hamming
distance 23 or something like that. I don't remember the dimension. But such a
code is not very random at all. Still, there is no reason why it wouldn't work
for hashing. In fact, it would work great, and would be designed to have nice
properties with respect to hashing. That is, the probability of collision would
be way lower than for our random numbers.

>As Martin, I also found absolutly no practical difference by using different
>PRNGs (from simple ones to combining some of the most state of art ones).
>

/David



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.