Author: Scott Gasch
Date: 19:16:08 01/20/02
On January 20, 2002 at 19:48:36, Claudio Della Corte wrote:

>On January 20, 2002 at 17:24:56, Scott Gasch wrote:
>
>[...]
>>Move 118 is the bug. No idea what it was, looks to be some kind of hash
>>problem. I can't get it to reproduce.
>
>It seems pretty much a bug due to TB, that must be handled carefully if not X
>men complete. Just a guess.
>Claudio

Well, I'm on the hunt tonight. I guess I did a stupid thing: shortly after this happened I decided monsoon was totally messed up and that I needed to kill and restart it before the next round. Well, that overwrote the logfile with the evidence in it... dumb. So the first thing I've done is get rid of my stupid code that overwrites logfiles and make it use a numbering scheme. A little late, huh?

I can't reproduce the move by inputting the PGN or by just running the FEN. I also tried running the positions in that game near the blunder with a full paranoid build (which runs at about 100 nps because of all the stuff it checks) and came up with nothing. So I am left to speculate here.

My first instinct is a hash bug, so I've looked over my hash code very carefully, added a bunch of asserts, etc. I think I may have found a problem and I've got an int 3 on it, so if it happens I'll know.

The next thing is the egtb files themselves. This is where I could use some help. I am now starting to see the reason for Bruce's "paranoia" about code he didn't write... I turned on TB_CRC_CHECK in my code that probes Eugene's tables as well as in Eugene's egtb.cpp. No CRC problems to report. I am going to grab a source version of crafty and make sure this egtb code hasn't changed since I stole it and merged it into my engine.

The last thing I can come up with is a bit flip in memory. Yes, you think I am crazy, but debugging crashed kernels at work I have seen this before, albeit rarely. I've got a machine in my office where I can tell you the physical address it happens at and which bit will get asserted.
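As an aside on the logfile numbering scheme mentioned above, here is a minimal sketch of one way it could work. This is not monsoon's actual code: the helper name `next_log_name` and the `<base>.<n>.log` naming pattern are assumptions made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from monsoon): find the first unused name of
 * the form "<base>.<n>.log" for n = 0, 1, 2, ... and write it into out.
 * Returns 0 on success, or -1 if no free name exists below max_n or the
 * name does not fit in out. */
static int next_log_name(const char *base, int max_n,
                         char *out, size_t out_len)
{
    for (int n = 0; n < max_n; n++) {
        int len = snprintf(out, out_len, "%s.%d.log", base, n);
        if (len < 0 || (size_t)len >= out_len)
            return -1;              /* formatting error or truncated name */

        FILE *f = fopen(out, "r");
        if (f == NULL)
            return 0;               /* cannot be opened for reading: treat as free */
        fclose(f);                  /* already exists, try the next number */
    }
    return -1;
}
```

The engine would call this once at startup and open the returned name for writing, so a restart never clobbers the previous run's log. There is a small window between the existence check and the later open; for a single engine process that is probably acceptable.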
Anyway, I have a stupid little memory check utility that I wrote (it locks a huge buffer in physical memory and runs pattern tests on it), and it will run on monsoon's hardware overnight. I don't expect to find anything... for one, I can't get all of the memory (the drivers and kernel need some, though I can get about 85% of it), and for two, the memory system of my PC is a hell of a lot better tested than my chess engine. So this is a shot in the dark.

I also went to every place in my code where I typecast something and double-checked the asserts above the cast. I was thinking I could have dropped a sign bit in a cast or something, but this seems unlikely now.

I'm more than open to suggestions about what else to try. I'd be grateful for any ideas.

Thanks,
Scott
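For readers wondering what the pattern tests in a memory checker like the one described above look like, here is a minimal sketch. It is not the actual utility: the names `fill_pattern`, `verify_pattern` and `run_pattern_tests` are hypothetical, and the sketch works on an ordinary buffer rather than one locked into physical memory. A real tool would also have to pin the pages (for example with an OS call such as `VirtualLock` or `mlock`) and deal with CPU caches satisfying the reads.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of checking one pattern: how many words were wrong, plus the
 * index and flipped bits of the first bad word (hypothetical layout). */
typedef struct {
    size_t   bad_words;
    size_t   first_bad_index;
    uint32_t first_bad_bits;
} pattern_result;

/* Write the same 32-bit pattern into every word of the buffer. */
static void fill_pattern(volatile uint32_t *buf, size_t words, uint32_t pattern)
{
    for (size_t i = 0; i < words; i++)
        buf[i] = pattern;
}

/* Read every word back and record any word that differs from the pattern. */
static pattern_result verify_pattern(const volatile uint32_t *buf,
                                     size_t words, uint32_t pattern)
{
    pattern_result r = {0, 0, 0};
    for (size_t i = 0; i < words; i++) {
        uint32_t diff = buf[i] ^ pattern;   /* set bits mark flipped bits */
        if (diff != 0) {
            if (r.bad_words == 0) {
                r.first_bad_index = i;
                r.first_bad_bits = diff;
            }
            r.bad_words++;
        }
    }
    return r;
}

/* Run a few fixed patterns and a walking-one pass over the buffer.
 * Returns the total number of mismatched words seen across all passes. */
static size_t run_pattern_tests(volatile uint32_t *buf, size_t words)
{
    static const uint32_t fixed[] = {
        0x00000000u, 0xFFFFFFFFu, 0xAAAAAAAAu, 0x55555555u
    };
    size_t total = 0;

    for (size_t p = 0; p < sizeof fixed / sizeof fixed[0]; p++) {
        fill_pattern(buf, words, fixed[p]);
        total += verify_pattern(buf, words, fixed[p]).bad_words;
    }
    for (int bit = 0; bit < 32; bit++) {
        uint32_t pattern = (uint32_t)1 << bit;
        fill_pattern(buf, words, pattern);
        total += verify_pattern(buf, words, pattern).bad_words;
    }
    return total;
}
```

Keeping the fill and verify steps separate means the index and flipped bits of the first bad word can be reported, which is the information needed to pin a fault to a specific address and bit.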
Last modified: Thu, 15 Apr 21 08:11:13 -0700