Computer Chess Club Archives


Subject: Re: hash collisions

Author: martin fierz

Date: 23:37:26 07/10/02


On July 11, 2002 at 01:28:54, Ricardo Gibert wrote:

>On July 11, 2002 at 01:09:24, martin fierz wrote:
>
>>On July 11, 2002 at 00:23:56, Ricardo Gibert wrote:
>>
>>>On July 10, 2002 at 23:44:40, martin fierz wrote:
>>>
>>>>On July 10, 2002 at 21:48:48, Ricardo Gibert wrote:
>>>>
>>>>>On July 10, 2002 at 21:31:24, martin fierz wrote:
>>>>>
>>>>>>On July 09, 2002 at 17:59:17, Robert Hyatt wrote:
>>>>>>
>>>>>>>I have been running some tests after prompting by Bruce, and the results
>>>>>>>have been interesting.
>>>>>>>
>>>>>>>The question posed by him was "how many hash collisions (signatures match
>>>>>>>but positions do not) can we tolerate in a search without having serious
>>>>>>>problems?"
>>>>>>>
>>>>>>>I did the following:  I made my HashProbe() routine use the node counter,
>>>>>>>and every N nodes, I would "match" a hash entry even if the signatures were
>>>>>>>not the same, simulating a collision.  I got all the way down to a collision
>>>>>>>every 1000 nodes without seeing even a score change at the root, which was
>>>>>>>surprising.
>>>>>>>
>>>>>>>I would like to ask others to try this as well.  It only adds a couple of
>>>>>>>lines to your HashProbe() code.  I started at one "collision" every 100K
>>>>>>>nodes, but that did nothing.  Then one every 10K.  And finally one every
>>>>>>>1K.  This seems surprising to me.  I ran several positions, with and without
>>>>>>>the "collision problem" and compared my logfile output.  Scores did not change
>>>>>>>at all.  I used tactical positions like Kopec22, opening positions, and even
>>>>>>>the classic fine #70.
>>>>>>>
>>>>>>>It seems that the search is far more resistant to such errors than anybody
>>>>>>>previously thought.  It would be interesting to see whether this is just an
>>>>>>>artifact of Crafty, or whether it holds for others.  Particularly for someone
>>>>>>>who hashes in the q-search, which I do not do...  Note that I did not
>>>>>>>false-match every N calls to HashProbe(); rather, I false-matched on the next
>>>>>>>call to HashProbe() after having searched N nodes since the previous false
>>>>>>>match.  Sometimes this would be less frequent than once per 1000 nodes,
>>>>>>>obviously, since I could burn several thousand nodes in the q-search and not
>>>>>>>do another false match until I got back out of that mess and into the normal
>>>>>>>non-qsearch part of the search.
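
just to make bob's change concrete, here is a sketch in C of what such a
modified probe could look like. all the names here (HashProbe, hash_table,
nodes_searched, the entry layout) are my guesses, not crafty's actual code:

#include <stdint.h>

typedef struct {
    uint64_t signature;            /* full key, stored for verification */
    int      score, depth, move;   /* whatever else the entry holds */
} HashEntry;

extern HashEntry     hash_table[];     /* the transposition table */
extern unsigned long table_size;       /* number of entries */
extern unsigned long nodes_searched;   /* global node counter */

#define COLLISION_INTERVAL 1000UL      /* one forced collision per 1000 nodes */

int HashProbe(uint64_t signature, HashEntry *out)
{
    static unsigned long last_forced = 0;
    HashEntry *slot = &hash_table[signature % table_size];
    int match = (slot->signature == signature);

    /* on the first probe after COLLISION_INTERVAL nodes, pretend the
       signatures matched even though they did not: a simulated collision */
    if (!match && nodes_searched - last_forced >= COLLISION_INTERVAL) {
        match = 1;
        last_forced = nodes_searched;
    }
    if (match) {
        *out = *slot;              /* hand back a possibly bogus entry */
        return 1;
    }
    return 0;                      /* genuine miss */
}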
>>>>>>
>>>>>>i made a small experiment: my checkers program uses two 32-bit numbers as a
>>>>>>hash signature. the first is used to compute the table position as
>>>>>>number % hashtable_size, the second is stored in the table for verification.
>>>>>>i ran a test suite of 78 positions, each searched to depth 13 and depth 19.
>>>>>>then i used only 10 bits of the 32-bit verification number to verify and ran
>>>>>>the whole thing again. result: out of 156 scores and moves, 3 scores were
>>>>>>different, by a very small amount (0.02, 0.04 and 0.04) while the moves were
>>>>>>the same, and in one instance the score was the same but the move different.
>>>>>>i checked, and the move from the 10-bit run was actually an error. of course,
>>>>>>using 10 bits for verification is ridiculously little. i repeated the
>>>>>>experiment using 20 bits, and now all scores and all moves were identical,
>>>>>>although the node counts were sometimes slightly different.
>>>>>>anybody want to claim 32 bits is too little for a hash signature? :-)
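
in code, the scheme i describe above looks roughly like this (a sketch with
placeholder names, not my program's real source; VERIFY_BITS is 32 normally,
and was 10 or 20 in the experiment):

#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint32_t check;                /* second 32-bit key, stored to verify */
    int      value, depth, move;
} Entry;

extern Entry    table[];           /* the hashtable, table_size entries */
extern uint32_t table_size;

#define VERIFY_BITS 10                            /* 10, 20 or 32 */
#define VERIFY_MASK ((1u << VERIFY_BITS) - 1u)    /* use 0xffffffffu for 32 */

Entry *probe(uint32_t lock1, uint32_t lock2)
{
    /* first key picks the slot... */
    Entry *e = &table[lock1 % table_size];

    /* ...truncated second key verifies it. with 10 bits, a wrong position
       still "verifies" with probability 1/1024 */
    if ((e->check & VERIFY_MASK) == (lock2 & VERIFY_MASK))
        return e;
    return NULL;                   /* miss */
}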
>>>>>>
>>>>>>
>>>>>>if my math doesn't let me down, this type of experiment should actually
>>>>>>produce a fixed error rate: with 10 bits, the probability that your signature
>>>>>>matches the number stored in the table is 1/1024 (for a different position,
>>>>>>of course...). so you should be seeing 1 false match per 1024 probes with
>>>>>>this setup, if you make a single probe into the hashtable.
>>>>>>
>>>>>>implementation details: my program uses MTD(f) and has a bucket size of 2, i.e.
>>>>>>it always probes two consecutive hashtable entries. so maybe my error rate was
>>>>>>even higher than 1 in 1024 in the 10-bit-signature experiment.
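
to spell that out: with a 10-bit check, a single probe of a wrong position
passes verification with probability 2^-10 = 1/1024. a bucket of 2 gives two
chances per lookup, so the chance of at least one false match per lookup is
1 - (1 - 1/1024)^2 = 2047/1048576, about 2/1024, i.e. roughly 1 in 512,
double the single-probe rate.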
>>>>>>
>>>>>>aloha
>>>>>>  martin
>>>>>
>>>>>This looks like a better test than RH's, but I think we are putting the cart
>>>>>before the horse. It would make more sense to test a plain vanilla alpha-beta
>>>>>program first before going on to tackle PVS and/or MTD(f).
>>>>
>>>>i'm not sure i agree. after all, i want to know how hash collisions could affect
>>>>my program, and that uses MTD. so i don't really care what the result in pure
>>>>alpha-beta is. of course it is an interesting question whether moving from
>>>>alpha-beta to PVS or MTD makes your program more or less sensitive to this kind
>>>>of error, but since i'm not a computer scientist, i won't spend lots of time
>>>>trying to find that out :-)
>>>
>>>
>>>If you want to understand a thing, it is better to move from the simple to the
>>>more complex. Of course, it is more relevant in the practical sense to
>>>understand how it affects MTD(f) from your standpoint, but you will understand
>>>that more precisely by being more methodical.
>>>
>>>I think this is worthwhile, because there may be a big payoff here: studying
>>>this might give insight into a better hash table replacement strategy. Just my
>>>2 cents.
>>
>>i don't understand either statement :-)
>>1) i don't think that doing either MTD or plain vanilla alpha-beta first is in
>>any way better. both algorithms interact with the hashtable in some complicated
>>manner, and in both cases hash collisions at nodes far away from the root have
>>a very good chance of being masked by the search. the only way to be more
>>methodical is to do vanilla alpha-beta, PVS, and MTD and compare the results.
>
>That is what I'm suggesting. And what is the most logical order (to me)?
>Alpha-beta first, IMHO.
>
>>2) if you want to study hashtable replacement, why not test different hashtable
>>replacement schemes? i don't see how results of hash collisions have anything to
>>do with efficient replacement strategies.
>
>Yes, testing different hashtable schemes is fine, but which ones? I think the
>effect in question suggests that some positions have more of an impact on the
>search than others.  Precisely how they do may suggest an improved hashtable
>replacement scheme. On the other hand, you may be right and it may not, but how
>do you know that without understanding exactly what is going on?

well, i definitely tested many different hashtable strategies. i used a single
table and a two-table approach, i used different bucket sizes, and i used
different replacement criteria (always replace, replace only if the new entry
is more valuable, etc.). i could do all these tests just by running my program
over a set of test positions and looking at how long it took to complete (or
how many nodes it had to search). i don't see the need to understand what is
going on to do this :-)
then again, if you really understand what is going on, you can maybe figure out
some better approaches to a replacement strategy; two of the criteria i tried
are sketched below.
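
for example, the "always replace" and "more valuable" criteria look roughly
like this in C (reusing the assumed Entry struct from the probe sketch further
up; the function names are placeholders too):

/* newest result always wins the slot */
void store_always(Entry *slot, const Entry *fresh)
{
    *slot = *fresh;
}

/* keep the old entry if it came from a deeper, i.e. more expensive, search;
   shallow results are cheap to recompute */
void store_if_more_valuable(Entry *slot, const Entry *fresh)
{
    if (fresh->depth >= slot->depth)
        *slot = *fresh;
}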

aloha
  martin


>>
>>aloha
>>  martin
>>
>>
>>>
>>>
>>>>my personal opinion is that it would affect all searches to the same order of
>>>>magnitude. i'd be very surprised if it were different.
>>>>
>>>>>Only a small percentage of all continuations are sensitive to collision
>>>>>error. This percentage is minuscule enough to make even a 1 in 1000 chance
>>>>>of collision not so bad.
>>>>
>>>>yes, but what about the situation where one of the child nodes of the current
>>>>position is "found" in the hashtable during the search with a wildly wrong
>>>>result? if that happens you are busted for sure.
>>>>
>>>>aloha
>>>> martin


