Computer Chess Club Archives



Subject: Re: Hashtables: is larger always better?

Author: Uri Blass

Date: 15:44:04 09/24/01



On September 24, 2001 at 17:55:57, Robert Hyatt wrote:

>On September 24, 2001 at 16:46:12, Antonio Dieguez wrote:
>
>>On September 24, 2001 at 16:32:35, Robert Hyatt wrote:
>>
>>>On September 24, 2001 at 16:24:13, Antonio Dieguez wrote:
>>>
>>>>On September 24, 2001 at 16:17:14, Robert Hyatt wrote:
>>>>
>>>>>On September 24, 2001 at 15:41:07, Antonio Dieguez wrote:
>>>>>
>>>>>>On September 24, 2001 at 15:27:35, Robert Hyatt wrote:
>>>>>>
>>>>>>>On September 24, 2001 at 13:53:58, Antonio Dieguez wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>>Several hash into 2 X 32 bit values.  You store one value, you use the other
>>>>>>>>>to generate the hash index.  This is not quite as safe as a true 64 bit hash
>>>>>>>>>signature where all 64 bits are used, but it is pretty good.  If you have
>>>>>>>>>one million entries in the table, your hash key is 52 bits long, effectively,
>>>>>>>>>which is not horrible.  Not as good as 64, but not horrible.
>>>>>>>>
>>>>>>>>hi. isn't one million entries around 2^20, so just 44 bits are used for the
>>>>>>>>key (not 52)?
>>>>>>>
>>>>>>>I am assuming that the hash signature is made up of two 32-bit words.  One of
>>>>>>>the 32 bit words is stored in the hash entry.  The other is used to generate
>>>>>>>the index.  That gives 32 + 20 == 52 bits used if you have a 1M entry table.
>>>>>>
>>>>>>yep, sorry.
>>>>>>
>>>>>>>>what I see is that 48 bits with a separate hash index is already safer than
>>>>>>>>64 bits without a separate index when using just 131072 entries (= 47 bits),
>>>>>>>>so maybe I am not understanding something.
>>>>>>>
>>>>>>>You aren't really using 48 bits.  You are using more.  You are using the number
>>>>>>>of bits you store in an entry + the number of bits you use to produce the table
>>>>>>>index.  In your case 65 (48 + 17).
>>>>>>
>>>>>>I'm hashing just 48 bits, so what do I lose? Only a few cycles. And what do I
>>>>>>win? That a collision doesn't become 100 times more likely if I increase the
>>>>>>hashtable 100 times.
>>>>>
>>>>>Increasing the hash table 100 times does not make the probability of a collision
>>>>>100 times greater.
>>>>
>>>>well, see my other post to Heiner and tell me where I am wrong.
>>>>>
>>>>>64 bits has been proven to do pretty well in today's chess engines.  John
>>>>>Stanback, myself, and a couple of others tested this several years ago in a
>>>>>thread on R.G.C.C, and with 64 bits we didn't have enough collisions to think
>>>>>about, while with 32 bits the collisions were overwhelming.
>>>>
>>>>you said it, "several years ago"
>>>
>>>2-3-4 years ago.  But doing searches with _billions_ of nodes to check for
>>>collisions.  We are nowhere near doing searches that big today, even on fast
>>>SMP hardware.  I let at least one test grind on a Cray for weeks, to see how
>>>many collisions I found.  It was so small as to be ignorable.  I think it was
>>>maybe 2-3 collisions for every 8 hours of 1M nps searching or some such thing...
>>
>>ok thanks for sharing the data.
>>64 bits is enough for a long time.
>>but with more and more RAM a collision is more and more possible too, just as
>>with speed (when using part of the bits for the index); don't negate that.
>
>
>Note that I don't sign on to the theory that more ram increases the probability
>of a collision.  The probability of a collision is 1 / 2^64 for a normal
>position.  This means for a table that holds 2^64 entries in a 2^64 signature
>space.  Smaller RAM improves that by overwriting/replacing entries before they
>can collide.  But the base 1/2^64 is good enough for me.  That is a very tiny
>number...
>
>Since we will never see memories that large, the probabilities are going to be
>good enough for a long time, assuming that 1 collision every few billion nodes
>is acceptable...  So far that seems to be ok...

If the probability is really 1/2^64 for every new node, then you are not getting
one hash collision every few billion nodes.

In that case you get one hash collision for every 2^64 nodes, which is clearly
more than a few billion, and simple math shows that the total number of nodes
that Deeper Blue can search in 100 years is smaller than 2^64 (I assume that
Deeper Blue can search 2^30 nodes per second for this discussion).

In 100 years you have 3600*24*365*100 < 2^32 seconds, so at 2^30 nodes per
second that is fewer than 2^32 * 2^30 = 2^62 nodes.
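That bound can be verified directly (the 2^30 nodes-per-second figure is the assumption stated above, not a measured Deeper Blue speed):

```python
# Seconds in 100 years fits comfortably under 2^32.
seconds_100_years = 3600 * 24 * 365 * 100
assert seconds_100_years < 2**32

# At an assumed 2^30 nodes per second, the total node count over
# 100 years stays below 2^62, and therefore well below 2^64.
nodes_per_second = 2**30
total_nodes = nodes_per_second * seconds_100_years
assert total_nodes < 2**62 < 2**64
print(total_nodes, 2**64)
```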

The fact that in practice you do get hash collisions in rare cases proves that
the probability is bigger than 1/2^64.
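Comparing the collision rate Hyatt reported earlier in the thread (2-3 collisions per 8 hours at 1M nps) against a flat 1/2^64 per node makes the gap concrete; the 2.5 below is an assumed midpoint of that 2-3 range:

```python
# Observed figure from the thread: ~2-3 collisions per 8 hours
# of searching at 1,000,000 nodes per second.
nodes = 8 * 3600 * 1_000_000      # nodes searched in 8 hours
observed_rate = 2.5 / nodes       # collisions per node (assumed midpoint)
expected_rate = 1 / 2**64         # rate if each node collides w.p. 2^-64

# The observed rate is many orders of magnitude above 1/2^64.
print(observed_rate, expected_rate, observed_rate / expected_rate)
```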

I do not say that the probabilities are not going to be good enough for a long
time, and I tend to believe that there is not going to be a problem in the next
10 years, but I need more data to be sure about it: namely, the probability of
playing a different move because of one hash collision at a random node.

Maybe somebody can try to simulate a hash collision by changing the score of
only one random node in the tree and reporting the probability of getting a
different move as a function of the size of the tree.
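A minimal sketch of such a simulation on a toy random game tree (plain minimax, no alpha-beta, no real hash table; every name and parameter here is an illustrative assumption, not anything from an actual engine): corrupt one random leaf score and count how often the root's chosen move changes.

```python
import random

def minimax_root_choice(leaves, branching, depth):
    """Index of the root move chosen by plain minimax over a complete
    tree of the given branching factor whose leaf scores are `leaves`."""
    def minimax(offset, d, maximizing):
        if d == 0:
            return leaves[offset]
        size = branching ** (d - 1)   # leaves under each child
        vals = [minimax(offset + i * size, d - 1, not maximizing)
                for i in range(branching)]
        return max(vals) if maximizing else min(vals)
    size = branching ** (depth - 1)
    scores = [minimax(i * size, depth - 1, False) for i in range(branching)]
    return scores.index(max(scores))

def collision_changes_move(branching, depth, trials, rng):
    """Fraction of trials in which corrupting one random leaf score
    (simulating a single hash collision) changes the root move."""
    changed = 0
    for _ in range(trials):
        n = branching ** depth
        leaves = [rng.uniform(-1, 1) for _ in range(n)]
        before = minimax_root_choice(leaves, branching, depth)
        corrupted = leaves[:]
        corrupted[rng.randrange(n)] = rng.uniform(-1, 1)  # the "collision"
        after = minimax_root_choice(corrupted, branching, depth)
        changed += (before != after)
    return changed / trials

rng = random.Random(1)
for depth in (2, 4, 6):
    print(depth, collision_changes_move(3, depth, 200, rng))
```

With a real alpha-beta searcher and hash table the numbers would differ, but even this toy model gives the shape of the curve as the tree grows.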

I believe that you are going to find that a single hash collision is less
dangerous as the size of the tree becomes bigger.

Uri




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.