Author: martin fierz
Date: 20:58:26 07/10/02
On July 10, 2002 at 22:01:23, Robert Hyatt wrote:

>On July 10, 2002 at 21:48:48, Ricardo Gibert wrote:
>
>>On July 10, 2002 at 21:31:24, martin fierz wrote:
>>
>>>On July 09, 2002 at 17:59:17, Robert Hyatt wrote:
>>>
>>>>I have been running some tests after prompting by Bruce, and the results have been interesting.
>>>>
>>>>The question posed by him was "how many hash collisions (signatures match but positions do not) can we tolerate in a search without having serious problems?"
>>>>
>>>>I did the following: I made my HashProbe() routine use the node counter, and every N nodes I would "match" a hash entry even if the signatures were not the same, simulating a collision. I got all the way down to a collision every 1000 nodes without seeing even a score change at the root, which was surprising.
>>>>
>>>>I would like to ask others to try this as well. It only adds a couple of lines to your HashProbe() code. I started at one "collision" every 100K nodes, but that did nothing. Then one every 10K, and finally one every 1K. This seems surprising to me. I ran several positions, with and without the "collision problem", and compared my logfile output. Scores did not change at all. I used tactical positions like Kopec22, opening positions, and even the classic Fine #70.
>>>>
>>>>Seems that the search is far more resistant to such errors than anybody has thought previously. It would be interesting to see if this is just a result of Crafty, or if it holds for others. Particularly for someone that hashes in the q-search, which I do not do... Note that I did not false match every N calls to HashProbe(); rather, I false matched on the next call to HashProbe() after having searched N nodes since the previous false match. Sometimes this would be less frequent than once per 1000 nodes, obviously, since I could burn several thousand nodes in the q-search and not do another false match until I get back out of that mess and back into the normal non-qsearch part.
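for concreteness, the injection bob describes might look something like this. a sketch only, not crafty's actual code; every name in it (HashProbe, hash_entry, TABLE_SIZE, nodes) is invented here:

/* sketch of the collision test described above -- NOT crafty's
 * actual code; all names are invented for illustration.  idea:
 * the first probe after every COLLISION_INTERVAL searched nodes
 * "matches" the stored entry even though the signatures differ,
 * simulating roughly one collision per 1000 nodes. */
#include <stdint.h>

#define COLLISION_INTERVAL 1000UL
#define TABLE_SIZE 0x100000UL          /* 2^20 entries */

typedef struct {
    uint64_t signature;                /* full 64-bit zobrist key */
    int16_t  score;
    int8_t   depth;
    uint8_t  flags;
} hash_entry;

static hash_entry    table[TABLE_SIZE];
static unsigned long nodes;            /* incremented by the search */
static unsigned long last_forced;      /* node count at last fake hit */

hash_entry *HashProbe(uint64_t sig) {
    hash_entry *e = &table[sig % TABLE_SIZE];
    if (e->signature == sig)
        return e;                      /* genuine match */
    if (nodes - last_forced >= COLLISION_INTERVAL) {
        last_forced = nodes;
        return e;                      /* forced false match */
    }
    return NULL;                       /* ordinary miss */
}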
>>>i made a small experiment: my checkers program uses two 32-bit numbers as its hash signature. the first is used to compute the table slot (number % hashtable size), the second is stored in the table for verification. i ran a test suite of 78 positions, each searched to depth 13 and to depth 19. then i used only 10 bits of the 32-bit verification number for verification and ran the whole thing again. result: out of 156 scores and moves, 3 scores differed, by a very small amount (0.02, 0.04 and 0.04), with the moves unchanged, and in one instance the score was the same but the move was different. i checked, and the 10-bit move was actually an error. of course, using 10 bits for verification purposes is ridiculously little. i repeated the experiment using 20 bits, and now all scores and all moves were identical, although the node counts were sometimes slightly different. anybody want to claim 32 bits is too little for a hash signature? :-)
>>>
>>>if my math doesn't let me down, this type of experiment should produce a fixed error rate: if you use 10 bits, the probability that your signature matches the one in the table is 1/1024 (for a different position, of course...). so you should be seeing 1 error in 1024 with this setup, if you make a single probe into the hashtable.
>>>
>>>implementation details: my program uses MTD(f) and has a bucket size of 2, i.e. it always probes two consecutive hashtable entries. so maybe my error rate was even higher than 1 in 1024 in the 10-bit-signature experiment.
>>>
>>>aloha
>>> martin
>>
>>This looks like a better test than RH's, but I think we are getting the cart before the horse.
>
>First, the definition of "better" is important.
>
>My last test was one error every 1,000 positions, _very_ close to his error rate of one every 1024 positions.
>
>However, his has "softer" errors than my test. IE at least his hash signature matches in _some_ bits, meaning that it is possible that the positions are fairly close. Mine matched in no bits, so that the error value could have come from a wildly different position.

i don't believe this "softer" thing. two hash signatures are random numbers, so the fact that they match in the last 10 bits means *nothing* at all about how close the two positions are on the board. besides, your numbers surely did not match in no bits: you were producing a collision between two random numbers x and y, and it would be a very big coincidence indeed if x == ~y! so your numbers should have matched on about half the bits, which of course again means nothing at all :-)

>>It would make more sense to test a plain vanilla alpha-beta program first before going on to tackle PVS and/or MTD(f). Also, if what I think is going on is correct, then any kind of forward pruning should affect the results somewhat. The variables complicating the result ought to be minimized.
>>
>>What I expect is going on is that alpha-beta examines mostly really crappy continuations, and as one might expect with really crappy continuations, there is more than one way to refute them at more than one point in a continuation. This is a type of redundancy that mitigates the errors. I believe only a minuscule percentage of all continuations are sensitive to collision error. This percentage is minuscule enough to make even a 1 in 1000 chance of collision not so bad.
>
>Apparently so, although more testing is needed. And the real question is "how many false matches before you are into a high probability of seeing a real error in a game?" That is not answered yet at all. Just because we produce the same move/score for a few test positions doesn't mean that a game won't have a different move or score, which _could_ be a problem.

that's right of course. 1 different move in 78 positions is too little to measure. but if you do this test with e.g. 1000 positions and find 10 where you play a different move, you have a reasonable estimate of 1 different move per 100 moves played by your program. if i went to fewer bits to get to, say, 8 errors in 78 positions, that would also be reliable enough to estimate a 10% error rate.

>I don't personally feel comfortable with any kind of error without knowing how it is going to influence actual games. IE if it is going to produce one wrong move in 10 games, that is enough to lose a game against a GM, and losing one of 10 to an error is not acceptable to me, period.

agreed. find yourself enough test positions and do the experiment. if you have enough positions you can plot the rate of "% different moves" against the number of bits in the hash signature and extrapolate to whatever you are really using. in my test i only have 78 positions, so i can't measure small "% different moves" rates. i suppose that the single hash-collision frequency should go down by a factor of 2^12 = 4096 with 32 bits instead of 20, where my program already played 78 moves without error.
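for concreteness, the reduced-bit verification i described above looks roughly like this. again a sketch only, not my actual checkers code, and the names (probe, entry, BITS, TABLE_SIZE) are made up:

/* rough shape of the reduced-bit verification experiment -- a
 * sketch, not real engine code.  key1 picks the slot, and only the
 * low BITS bits of key2 are compared for verification.  for a
 * genuinely different position, a spurious match happens with
 * probability 2^-BITS per probed slot (1/1024 at 10 bits), roughly
 * doubled by the bucket of 2. */
#include <stdint.h>
#include <stddef.h>

#define BITS 10                          /* 10, 20, or the full 32 */
#define VERIFY_MASK ((uint32_t)((1ULL << BITS) - 1ULL))
#define TABLE_SIZE 0x100000UL

typedef struct {
    uint32_t check;                      /* stored verification key */
    int16_t  value;
    int8_t   depth;
    uint8_t  flags;
} entry;

static entry table[TABLE_SIZE];

/* bucket of 2: probe two consecutive slots */
entry *probe(uint32_t key1, uint32_t key2) {
    size_t base = key1 % TABLE_SIZE;
    for (int j = 0; j < 2; j++) {
        entry *e = &table[(base + j) % TABLE_SIZE];
        if ((e->check & VERIFY_MASK) == (key2 & VERIFY_MASK))
            return e;                    /* "verified" on BITS bits only */
    }
    return NULL;
}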
now if the single hash-collision frequency is proportional to the error probability at the root, then 32 bits is pretty safe :-)
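to put rough numbers on that, here is the back-of-envelope calculation, assuming the error rate seen at the root scales linearly with the single-probe collision rate:

/* back-of-envelope only: per-probe spurious-match rate with k
 * verification bits is 2^-k for a different position; a bucket of 2
 * roughly doubles it.  going from 20 to 32 bits divides the rate by
 * 2^12 = 4096. */
#include <stdio.h>
#include <math.h>

int main(void) {
    int bits[] = {10, 20, 32};
    for (int i = 0; i < 3; i++) {
        double per_probe  = pow(2.0, -bits[i]);
        double per_bucket = 1.0 - pow(1.0 - per_probe, 2.0);
        printf("%2d bits: per probe %.3g, bucket of 2 %.3g\n",
               bits[i], per_probe, per_bucket);
    }
    return 0;
}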
aloha
  martin