Computer Chess Club Archives

Search

Terms

Messages

Subject: Re: About random numbers and hashing

Author: David Rasmussen

Date: 03:57:28 12/09/01

On December 08, 2001 at 12:25:59, Sven Reichard wrote:

>Just a few more comments; sorry I've not been around for a couple of days.
>
>The idea that "just a couple of bits have to change" to get from one vector to
>another is important in error correcting codes, since there we deal with a
>symmetric binary channel where during transmission each bit has an equal
>probability of being flipped due to noise. Hence in this case, the Hamming
>distance is *the* quality criterion if we can decode the received code by
>finding the "closest" code word.
>However I hope that my data bus does not represent such a channel; i.e., there
>is no noise in my system.

I hope it doesn't either but that has nothing to do with this discussion.

>About linear independence: If you pick 768 random vectors of length 64, then
>they span the whole space with probability close to 1 (I haven't exactly
>computed that, and I don't intend to do it.)

Sure, but that's not the only thing that is important. We want every vector be
the linear combination of as many basis vectors as possible. If you imagine it
in three dimensions with a plain orthonormal basis, we don't want the vectors to
lie close to the axes. We want them scattered equally about.

>About portability of RNG's: As others have pointed out before, the easiest way
>to achieve portability is to generate the codes a priori, store them in a file,
>and include them in the distribution. Even using only one compiler, the RNG
>could be changed in the next version, making your opening book worthless.
>

Sure, or having your own PRNG and use a fixed seed. That's what I do, and it
eleminates the need for a file with numbers etc.

>David, could you once more point out which of the following statements you
>disagree with?
>a) We want to avoid different positions getting the same code.

Sure.

>b) We especially don't want that to happen close to the root.

Well, sure, but for me it is unacceptable to have it happening at all, if it
means a collision (which it might not necesarily).

>c) Hence, we want positions that are only a few moves apart to have different
>codes.

Sure.

>d) That means that we want small sets of hash codes to be linearly independent.

Yep. Actually, we want as big sets as possible to be linearly independent.
Define the number n to be the size of the smallest subset of vectors that are
linearly dependant. We want n to be as big as possible. But at the same time we
want the vectors to spread equally under the hashing function. My claim is that
we can influence n.

>e) Linear independence isn't affected by linear transformations.

Sure.

>f) The Hamming distance does depend on the choice of the vector space base.

I disagree, but maybe I don't understand what you mean. Hamming distance does
not depend on the choice of the vector space base.

>g) Given a set of vectors with minimal weight and minimal distance 17, there is
>a linear transformation moving it to a set of minimal weight 1 and minimal
>distance 2 (which in terms of linear independence is just as good as the one we
>started with).

You lost me there.

>h) Given all of the above, the Hamming distance appears to be totally unrelated
>to the quality of our set.
>

I don't agree with that conclusion (obviously), and I certainly don't understand
how you've reached it here.

>As I said before, if your Hamming criterion works better than pure random
>numbers, that's fine with me; I would just like to understand the reason why.
>Always the inquisitive mind...
>

I also would like to know _exactly_ why it works better. As I have said before,
no one really knows the exact full set of criterions we want for our vectors.

>Actually, the problem of hash codes is a little bit more complicated, since we

Exactly.

>also want to avoid minor collisions, i.e., different positions occupying the
>same spot in the transposition table. Since usually the first 20 bits of the
>hash key determine that position, we would like to avoid linear independence
>also for small sets of these truncated vectors. Have you tried applying the
>Hamming criterion to these leading bits?
>

No, but you are right. I have been thinking the same thing. And of other
criterions as well.

>Enough for today :)
>Sven

/David

Re: About random numbers and hashing Sven Reichard 06:40:40 12/10/01
Re: About random numbers and hashing Sune Fischer 04:31:48 12/09/01
- Re: About random numbers and hashing David Rasmussen 06:14:10 12/09/01

This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.