Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Hashtables: is larger always better?

Author: Robert Hyatt

Date: 07:12:31 09/25/01

Go up one level in this thread


On September 25, 2001 at 04:11:11, Sune Fischer wrote:

>On September 24, 2001 at 22:43:47, Robert Hyatt wrote:
>
>>>Today the size of the hash tables on the best hardware can be at least
>>>512 Mbytes that mean 2^29 bytes and 2^26 64 bits numbers.
>>>
>>>2^32 nodes per hour is also possible for Deep Fritz on good machine and it means that 2^38 nodes per 64 hours is possible.
>>
>>I don't think that is the issue.  There are roughly 2^160 chess positions.
>>We are hashing that into a 2^64 address space.  That gives 2^96 different
>>chess positions for every hash position.
>>
>>_that_ is the collision problem.
>>
>>I simply answered the question in an expermental way as I explained to Uri
>>in a post right above this one.
>>The answer suggested that 64 hours is not
>>going to cause a problem.  The question is "how many false matches does it
>>take to screw up a tree of size (say) 100M nodes?"  I experimentally answered
>>that one is not enough.  And I also experimentally found that at 1M nodes per
>>second, I was getting less than one false match per hour, which is about 4B
>>nodes.  So I got less than 1 collision every 4B nodes, and I ran a test that
>>proved that one false match ever 100M nodes won't kill the search.  I conclude
>>that things are safe for a long time.
>
>I understood the question as: "is larger _always_ better".
>The answer would be no.
>Because a hash table of size 2^64 and a computer that could do 2^64 nodes per
>move would fill this in no time and the rest of the game after ply ~30 would be
>_all_ collisions!


No it wouldn't.  Remember that there are roughly 2^160 different unique
chess positions.  We can store up to 2^64 of those.  Leaving 2^96 possible
collisions for every possible entry in the 2^64 table.  It doesn't matter
whether you store 2^32 entries or 2^64 entries.  The potential collisions
are both so large it doesn't matter.  And the only undetected collisions are
going to be the cases where you search two of those 2^96 different positions
that map to the same table entry.  And it won't matter whether you have just
one table entry or 2^64.

going from 2^20 entries to 2^21 entries does not change this enough to even
think about.  We would just be depending on a bit of luck to cause the first
of the two positions to be overwritten before we probe and false-match on the
second.  That is yet another degree of freedom.  And it gets dwarfed by that
huge collision predictor (2^96 positions for every single table entry)
anyway...



>
>The reality is quite different, but we still have an almost linear probability
>increase of collisions with hash size, eventhough the number is still very small
>for huge tables.
>I believe you stated otherwise at some point, which was why i responded.
>
>Besides, to me those are two very different questions (theory and reality), and
>a wise man once said: don't you be mixing apples and oranges ;)


OK...  In theory, it _may_ be slightly worse to have a larger table.  In reality
I believe the gains in search efficiency _far_ outweight any nominal increase in
the chance for a collision.  that 2^96 is a very large number and can't be
overcome unless we go to a larger than 64 bit signature.  And it won't be
eliminated until we get to a 160 bit signature.


>
>
>>>Here my estimate is even bigger amd I estimate almost 1 hash collision per move
>>>if you use a good hardware for 64 hours per move.
>>
>>That would be great.  One collision every 64 hours is _not_ going to influence
>>anything at all...
>>
>>But even one ever 100M nodes is not harmful..
>>
>>
>>
>>>
>>>I do not say that this almost 1 hash collision per move is important and it is
>>>possible that it causes problem only in 1/100000 of the cases but I do not know
>>>a good estimate for it.
>>
>>I ran a test with an error every 100M nodes, across about 100 test positions,
>>and found zero changes at the root.  No score changes, and absolutely no move/PV
>>changes.  We could run that kind of test again, but be prepared to burn months
>>of computer time.  I did.  :)
>>
>>At one time I had 4 Crays smoking along until someone asked me "what in the
>>hell is this stuff you are running up here?"  :)
>>
>>
>>
>>>"
>
>-S.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.