Author: Robert Hyatt
Date: 08:36:09 07/11/02
On July 10, 2002 at 23:58:26, martin fierz wrote:

>On July 10, 2002 at 22:01:23, Robert Hyatt wrote:
>
>>On July 10, 2002 at 21:48:48, Ricardo Gibert wrote:
>>
>>>On July 10, 2002 at 21:31:24, martin fierz wrote:
>>>
>>>>On July 09, 2002 at 17:59:17, Robert Hyatt wrote:
>>>>
>>>>>I have been running some tests after prompting by Bruce, and the results have been interesting.
>>>>>
>>>>>The question posed by him was "how many hash collisions (signatures match but positions do not) can we tolerate in a search without having serious problems?"
>>>>>
>>>>>I did the following: I made my HashProbe() routine use the node counter, and every N nodes I would "match" a hash entry even if the signatures were not the same, simulating a collision. I got all the way down to a collision every 1000 nodes without seeing even a score change at the root, which was surprising.
>>>>>
>>>>>I would like to ask others to try this as well. It only adds a couple of lines to your HashProbe() code. I started at one "collision" every 100K nodes, but that did nothing. Then one every 10K, and finally one every 1K. This seems surprising to me. I ran several positions with and without the "collision problem" and compared my logfile output. Scores did not change at all. I used tactical positions like Kopec 22, opening positions, and even the classic Fine #70.
>>>>>
>>>>>It seems that the search is far more resistant to such errors than anybody has previously thought. It would be interesting to see whether this is just a result of Crafty, or whether it holds for others, particularly for someone who hashes in the q-search, which I do not do. Note that I did not false-match every N calls to HashProbe(); rather, I false-matched on the next call to HashProbe() after having searched N nodes since the previous false match. Sometimes this would be less frequent than once per 1000 nodes, obviously, since I could burn several thousand nodes in the q-search and not do another false match until I got back out of that mess and into the normal non-q-search part.
>>>>
>>>>i made a small experiment: my checkers program uses two 32-bit numbers as its hash signature. the first is used to compute the table position as number % hashtablesize; the second is stored in the table for verification. i ran a test suite of 78 positions, each searched to depth 13 and depth 19. then i used only 10 bits of the 32-bit verification number to verify, and ran the whole thing again. result: out of 156 scores and moves, 3 scores were different, by a very small amount (0.02, 0.04 and 0.04) with the same moves, and in one instance the score was the same but the move was different. i checked, and the 10-bit move was actually an error. of course, using 10 bits for verification purposes is ridiculously little. i repeated the experiment using 20 bits, and now all scores and all moves were identical, although the node counts were sometimes slightly different. anybody want to claim 32 bits is too little for a hash signature? :-)
>>>>
>>>>if my math doesn't let me down, this type of experiment should actually produce a fixed error rate: if you use 10 bits, the probability that the number in your signature matches the one stored in the table is 1/1024 (for a different position, of course...). so you should be seeing 1 error in 1024 probes with this setup, if you make a single probe into the hashtable.
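For concreteness, the forced-collision hook described at the top of the thread could look roughly like the sketch below. All the names here (HashProbe, nodes_searched, the entry layout) are invented for illustration; Crafty's real code differs.

  /* Sketch of the forced-collision test: once N nodes have been
     searched since the last forced error, treat the next probed
     entry as a hit even though its full signature does not match. */

  #include <stdint.h>

  #define COLLISION_INTERVAL 1000       /* ~one false match per 1K nodes */

  typedef struct {
    uint64_t signature;                 /* full 64-bit key, for checking */
    int score;                          /* ... plus depth, bound, move   */
  } HashEntry;

  extern HashEntry table[];
  extern uint64_t table_mask;           /* table size is a power of two  */
  extern uint64_t nodes_searched;       /* bumped once per node searched */
  static uint64_t last_forced;          /* node count at last false hit  */

  HashEntry *HashProbe(uint64_t key) {
    HashEntry *h = &table[key & table_mask];
    if (h->signature == key)
      return h;                         /* genuine match                 */
    if (nodes_searched - last_forced >= COLLISION_INTERVAL) {
      last_forced = nodes_searched;     /* guaranteed-wrong "collision"  */
      return h;
    }
    return 0;                           /* ordinary miss                 */
  }

Because the hook fires only on the first probe after the counter trips, long q-search bursts between probes make the real rate somewhat lower than one per N nodes, exactly as described above.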
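The checkers-side experiment is equally small to express. Here is a sketch under the layout martin describes (one 32-bit word for the table index, a second stored for verification); VERIFY_BITS is the knob set to 10 or 20 in the experiment, and every other name is invented for illustration.

  #include <stdint.h>

  #define VERIFY_BITS 10                /* 10 or 20; use a full compare
                                           to verify with all 32 bits    */
  #define VERIFY_MASK ((1u << VERIFY_BITS) - 1u)

  typedef struct {
    uint32_t verify;                    /* second signature word         */
    int16_t score;                      /* ... plus depth, best move     */
  } Entry;

  extern Entry table[];
  extern uint32_t table_size;

  Entry *probe(uint32_t key1, uint32_t key2) {
    Entry *e = &table[key1 % table_size];
    /* With only VERIFY_BITS compared, a wrong position stored in this
       slot passes with probability 2^-VERIFY_BITS: 1/1024 at 10 bits,
       the fixed error rate computed above.  Probing a bucket of two
       consecutive entries (mentioned below) roughly doubles that
       chance per node. */
    if ((e->verify & VERIFY_MASK) == (key2 & VERIFY_MASK))
      return e;
    return 0;
  }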
>>>>
>>>>implementation details: my program uses MTD(f) and has a bucket size of 2, i.e. it always probes two consecutive hashtable entries. so maybe my error rate was even higher than 1 in 1024 in the 10-bit-signature experiment.
>>>>
>>>>aloha
>>>>  martin
>>>
>>>This looks like a better test than RH's, but I think we are putting the cart before the horse.
>>
>>First, the definition of "better" is important.
>>
>>My last test was one error every 1,000 positions, _very_ close to his error rate of one every 1024 positions.
>>
>>However, his test has "softer" errors than mine. IE at least his hash signature matches in _some_ bits, meaning that it is possible that the positions are fairly close. Mine matched in no bits, so the error value could have come from a wildly different position.
>
>i don't believe this "softer" thing. just because two hash signatures = random numbers match in the last 10 bits should mean *nothing* at all about how close the two positions are on the board. besides, your numbers surely did not match in no bits, because you were producing a collision between two random numbers x and y - and it would be a very big coincidence indeed if x == ~y! so your numbers should have matched in about half the bits - of course, again meaning nothing at all :-)

Perhaps we are discussing two different things. When I match a hash signature, I do the following:

  if current_signature == hash_signature

Someone suggested that I change it to this:

  if (current_signature&1023) == (hash_signature&1023)

That _does_ produce "softer errors", because on occasion, when the two signatures match in the low 10 bits, they match in _all_ 64 bits; hence "soft errors". My approach didn't necessarily match in _any_ bits at all and was guaranteed to be wrong every time, because I first tested for a good match; if there was none, I checked the counter, and if it was large enough I forced a match on a guaranteed-to-be-wrong hash signature.

>
>>>It would make more sense to test a plain vanilla alpha-beta program first before going on to tackle PVS and/or MTD(f). Also, if what I think is going on is correct, then any kind of forward pruning should affect the results somewhat. The variables complicating the result ought to be minimized.
>>>
>>>What I expect is going on is that alpha-beta examines mostly really crappy continuations, and as one might expect with really crappy continuations, there is more than one way to refute them at more than one point in a continuation. This is a type of redundancy that mitigates the errors. I believe only a minuscule percentage of all continuations are sensitive to collision error, and that percentage is small enough to make even a 1 in 1000 chance of collision not so bad.
>>
>>Apparently so, although more testing is needed. And the real question is "how many false matches before you are at a high probability of seeing a real error in a game?" That is not answered yet at all. Just because we produce the same move/score for a few test positions doesn't mean that a game won't have a different move or score, which _could_ be a problem.
>
>that's right of course. 1 different move in 78 positions is too little. but if you do this test with e.g. 1000 positions and find 10 where you play a different move, you have a reasonable estimate of 1 different move per 100 moves played by your program. if i were to go to fewer bits to get to, say, 8 errors in 78 positions, that would also be reliable enough for 10% errors.
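The "soft" versus "hard" distinction above comes down to two different test hooks. Sketched side by side with invented names: the masked compare occasionally accepts an entry whose full signature genuinely matches, while the forced match fires only after the full compare has already failed, so it is wrong by construction every time.

  #include <stdint.h>

  /* Soft: accept on the low 10 bits alone.  Sometimes the two
     signatures agree in all 64 bits, so not every "hit" is an error. */
  int soft_match(uint64_t current_sig, uint64_t entry_sig) {
    return (current_sig & 1023) == (entry_sig & 1023);
  }

  /* Hard: force is set by the every-N-nodes counter and is consulted
     only after the full 64-bit compare has failed, so every forced
     hit belongs to a different position. */
  int hard_match(uint64_t current_sig, uint64_t entry_sig, int force) {
    if (current_sig == entry_sig)
      return 1;
    return force;
  }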
>
>>I don't personally feel comfortable with any kind of error without knowing how it is going to influence actual games. IE if it is going to produce one wrong move in 10 games, that is enough to lose a game against a GM, and losing one game in 10 to an error is not acceptable to me, period.
>
>agreed. find yourself enough test positions and do the experiment. if you have enough positions, you can plot the rate of "% different moves" against the number of bits in the hash signature and extrapolate to whatever you are really using. in my test i only have 78 positions, so i can't go down to small % different moves.
>
>i suppose that the single hash-collision frequency should go down by a factor of 2^12 with 32 bits instead of the 20 bits with which my program already played 78 moves without error. now, if the single hash-collision frequency is proportional to the error probability at the root, then 32 bits is pretty safe :-)
>
>aloha
>  martin

So long as you are willing to accept that "error rate", of course. I'm not sure I will accept even one in 1000 wrong scores.
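For a sense of scale on that factor of 2^12: with b verification bits, a probe against a wrong stored position passes with probability 2^-b, so expected false matches grow linearly with the number of probes. The node and probe counts below are made-up round figures, purely for illustration, not measurements from either program.

  #include <stdio.h>
  #include <math.h>

  int main(void) {
    const double nodes = 1e6;           /* assumed nodes per search       */
    const double probes = 2.0;          /* bucket of two entries per node */
    const int bits[] = {10, 20, 32};
    for (int i = 0; i < 3; i++) {
      double p = pow(2.0, -bits[i]);    /* per-probe false-match chance   */
      printf("%2d bits: p = %.3g per probe, ~%.4g false matches/search\n",
             bits[i], p, nodes * probes * p);
    }
    return 0;
  }

  /* Roughly 2000 false matches per million-node search at 10 bits, ~2 at
     20 bits, and ~0.0005 at 32 bits -- the 2^12 drop mentioned above.   */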