Computer Chess Club Archives



Subject: Re: Hashtables: is larger always better?

Author: Robert Hyatt

Date: 17:06:26 09/25/01



On September 25, 2001 at 18:20:29, Sune Fischer wrote:

>On September 25, 2001 at 14:21:32, Robert Hyatt wrote:
>
>>>Using my example with 2^64 unique nodes calculated per move and a 2^64 size
>>>table, you will fill this table on the first move (assuming no collisions). If
>>>you have an average branch factor of 6, then this is less than 30 plies deep. So
>>>when you get to positions deeper than 30 plies, you will be seeing only new
>>>positions, and since your table is full with outdated nodes, they will all
>>>collide in the hash.
>>
>>no... that is simply an incorrect assumption.  You are assuming that we search
>>until we fill 2^64 entries, which is going to take way beyond 2^64 nodes in
>>the search due to transpositions.
>
>Yes, true, it will take some time to get it full - say 5 moves - but it
>doesn't change anything.

OK... reality check.  How long to search 2^64 _unique_ positions?  That has to
be X total positions where X >> 2^64.  At 4 billion nodes per second, something
we won't see for many years, that takes over 4 billion seconds, or roughly 130
years.  That is to search just 2^64 total positions.  2^64 _unique_ positions
is going to take some constant N times that.  At least a factor of 100.
Probably a factor of millions.

That table isn't going to fill up soon.

Second, the search is localized in groups of 2^64 nodes that can collide with
each other.  The search space is not spread over the entire 2^160 positions
at any one time, but is limited to a very small subset of that total, which
reduces the collision opportunities significantly.



>
>>Then we somehow jump over to another part
>>of the search that _only_ produces positions that match in 64 bits, but not
>>in the full 160.  That is very unlikely to happen.
>
>Not unlikely, it will happen if you had the power.

OK... But the original question was "is bigger hash always better".  If you
want to say "For the next 500K years, yes.  Beyond that, no" then I will
agree.  But to say "no today" I don't agree at all....




>
>>  The game progresses
>>naturally forward and _most_ probes into a space containing a full 2^64
>>nodes are going to be _real_ matches, not false ones.
>
>In the beginning yes, but not further down the road.

Got to be the same there too.  Some will be overwritten.  With the _right_
stuff, rather than false matching with the wrong stuff...



>
>>It is more likely that
>>as you search, old things _will_ be overwritten by new things, due to the way
>>I "age" the hash entries.
>
>How you do things in Crafty is a completely different matter.

Why?  The question is still the same.  "Is bigger always better?"  I don't
really care about inefficient implementations.  Or faulty implementations...
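For readers who haven't seen an aging scheme, here is a minimal sketch of the
general idea - illustrative only, not Crafty's actual code, and the entry
layout and names are invented for the example.  Each entry remembers which
search stored it; entries left over from an old search are overwritten freely,
while entries from the current search are only replaced by deeper results:

```c
#include <stdint.h>

/* One transposition-table entry (hypothetical layout). */
typedef struct {
    uint64_t key;    /* full 64-bit Zobrist signature            */
    int      depth;  /* draft of the stored result               */
    int      age;    /* search counter when the entry was stored */
} HashEntry;

#define TABLE_BITS 20
static HashEntry table[1 << TABLE_BITS];
static int current_age;     /* bumped once per new search */

void hash_store(uint64_t key, int depth) {
    HashEntry *e = &table[key & ((1 << TABLE_BITS) - 1)];
    /* Stale entries lose all priority; fresh ones keep the deeper draft. */
    if (e->age != current_age || depth >= e->depth) {
        e->key = key; e->depth = depth; e->age = current_age;
    }
}

/* Returns 1 on a full 64-bit signature match. */
int hash_probe(uint64_t key, int *depth) {
    HashEntry *e = &table[key & ((1 << TABLE_BITS) - 1)];
    if (e->key == key) { *depth = e->depth; return 1; }
    return 0;
}
```

Bumping current_age between searches is what makes old positions get replaced
by new ones as the game moves forward, instead of lingering and colliding.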




>
>>When the table is full, _every_ probe is not going
>>to be a false match.  Almost 100% of them will be true matches with a few false
>>matches thrown in as new positions are searched which match in 64 but not in
>>160.
>>
>>This logic that says a larger hash table is worse is simply defective.  There
>>is a distinct gain as the hash table is made larger.  I simply claim that this
>>gain _always_ overrules any possible small increase in collisions.  _always_.
>
>Then you did not understand my example.
>
>>Which means, simply stated, "bigger is _always_ better, or at least it is
>>_never_ any worse than a smaller table.  Not always better because once the
>>table is big enough to hold everything you can search, more won't help."
>
>With current kNs and 64-bit keys - then yes.
>But generally the answer is no, smaller keys or more power would increase the
>collision rate to unacceptable levels with too large a table.

Yes, but not for thousands of years into the future.  When are we _really_
going to be able to search 16,000,000,000,000,000,000 nodes?  (4 billion times
4 billion)???  Today making the table larger does _not_ appreciably increase
the chance for collisions.  It _does_ increase the search efficiency however.
The net is a _gain_...



>
>
>>>If you "only" use a 2^32 size table, you will have a 50% chance for each of the
>>>2^64 nodes to collide, so that's 2^32 collisions per move.
>>>This continues all the way down to 1, where there will be only 1 collision on
>>>average per move.
>>
>>Again assuming that the search is now only searching _new_ positions.  This is
>>not a good assumption.  By the time the table is that full, most probes will
>>be hits or misses, rather than simply collisions.
>
>Oops, typo here, it should read 2^63 of course.
>It is a good assumption, the game progresses and new positions never seen
>before will emerge. Eventually the table will fill up.
>
>>>>going from 2^20 entries to 2^21 entries does not change this enough to even
>>>>think about.
>>>
>>>It changes by a factor of 2, I'm not saying that it is a problem,
>>>I'm merely saying that it is a mathematical fact.
>>>(I have made the assumption that you fill the table quickly, because these
>>>probabilities are incorrect while there is still available space in the hash.)
>>>
>>
>>if the table could be that big, then for each iteration, exactly 1/2 of
>>the total nodes searched were the same nodes that were searched the last
>>iteration, with a new "layer" of endpoints added on.  But then, these "new"
>>positions are very likely not new at all, but simply transpositions to positions
>>already searched.
>>
>>You are making an assumption that isn't exactly correct.  That as we search,
>>we are only traversing _new_ positions that have not been seen before.  That is
>>generally wrong.  It would not be hard to compute this statistic for reasonable
>>sized hash tables using a real engine and real search.
>
>As the table gets bigger we have more of everything, more good hits and more
>collisions. I don't understand why you keep fighting it.

Because of the math.  I won't start to see lots of collisions until I begin
to search enough to overlap into the next 2^64 segment of the 2^160 search
space.  That is a _long_ way from happening today...

I can barely search 2^32 positions today, given an hour to do it.  To search
2^64 today takes 4 billion hours.  That is 500,000 years...  I don't worry about
such numbers...




>The assumptions I made were to prevent people saying things like: "... but
>then what if...".
>With more hits (e.g. transpositions) you will go deeper in your search, but
>you still can't solve chess, the base tree will still have limited depth.
>
>-S.


If you want to go theoretical and talk about solving the game, then yes, 2^64
is not big enough.  The keys would have to be close to 160 bits - close enough
that collisions won't change the root score and screw things up.

But for the question "is a bigger hash table always better?" today the answer
is a simple "yes, so long as you don't make it so big that you start to page
and slow down due to disk I/O."  Collisions don't play any role in that decision
whatsoever today...  And won't for the next few thousand centuries...




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.