Computer Chess Club Archives


Subject: Re: Most obscure bug *ever*

Author: Sune Fischer

Date: 13:13:00 10/23/02


On October 23, 2002 at 15:25:04, Colin Frayn wrote:


>Then I managed to add in some debugging code which quit as soon as the key
>stored in the Board structure was no longer correct.  I got Beo to print out the
>position.

I'm surprised you have managed this long without such a util.

I have this at the end of my MakeMove(), and I couldn't do without it! :)

#ifdef DEBUG_MODE
	ppos->board=*this;
	ValidateBoard(move);
	ValidateAttacks(move);
	ValidateIncrementalScore();
	ValidateKey(move);
	ValidatePawnKey(move);
#endif

I know instantly which move causes the error and can simply print the position
before and after the move.
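
For anyone who wants to add a check like ValidateKey(), here is a minimal sketch of the idea: recompute the hash key from scratch and compare it with the incrementally updated one. Everything in it (the Board layout, the Zobrist table, ComputeKeyFromScratch, KeyIsValid, the fixed seed) is a hypothetical stand-in for illustration, not the actual code from my engine or from Beo.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical minimal board: 64 squares, 0 = empty, 1..12 = piece codes.
struct Board {
    std::array<int, 64> sq{};
    bool whiteToMove = true;
    std::uint64_t key = 0;  // hash key maintained incrementally by MakeMove
};

// Hypothetical Zobrist table: one random number per (piece, square) plus one for side to move.
struct Zobrist {
    std::uint64_t piece[13][64];
    std::uint64_t side;
    Zobrist() {
        std::mt19937_64 rng(12345);  // fixed seed so a failing run can be repeated
        for (auto& row : piece)
            for (auto& v : row) v = rng();
        side = rng();
    }
};

inline const Zobrist& zobrist() {
    static const Zobrist z;
    return z;
}

// Slow but simple: rebuild the key by looking at every square.
std::uint64_t ComputeKeyFromScratch(const Board& b) {
    std::uint64_t k = 0;
    for (int s = 0; s < 64; ++s)
        if (b.sq[s] != 0) k ^= zobrist().piece[b.sq[s]][s];
    if (!b.whiteToMove) k ^= zobrist().side;
    return k;
}

// The check itself: the incremental key must match the recomputed key.
bool KeyIsValid(const Board& b) {
    return b.key == ComputeKeyFromScratch(b);
}

// Simplified move (no castling, en passant or promotion) with incremental key update.
void MakeMove(Board& b, int from, int to) {
    const Zobrist& z = zobrist();
    const int p = b.sq[from];
    if (b.sq[to] != 0) b.key ^= z.piece[b.sq[to]][to];  // remove captured piece
    b.key ^= z.piece[p][from];                          // piece leaves 'from'
    b.key ^= z.piece[p][to];                            // piece arrives on 'to'
    b.key ^= z.side;                                    // side to move changes
    b.sq[to] = p;
    b.sq[from] = 0;
    b.whiteToMove = !b.whiteToMove;
#ifdef DEBUG_MODE
    assert(KeyIsValid(b));  // fails on the exact move that corrupts the key
#endif
}
```

The recomputation is far too slow for normal play, which is why it sits behind DEBUG_MODE.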

That catches a lot, obviously, but yesterday I broke something in
UndoNullMove() and the error wasn't caught until the search returned to that ply
and tried the next move there. It took me some time to find that.
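
One way to catch that kind of thing earlier would be to validate the undo side as well: take a snapshot before the null move and compare after undoing it. This is only a sketch under assumptions; State, NullUndo, EpKey, kSideKey and NullMoveRoundTrips are made-up names, and the deliberately broken UndoNullMoveBuggy is there just to show what the check catches.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the state that a null move touches.
struct State {
    bool whiteToMove = true;
    int epSquare = -1;        // -1 means no en passant square
    int halfmoveClock = 0;
    std::uint64_t key = 0;
    bool operator==(const State& o) const {
        return whiteToMove == o.whiteToMove && epSquare == o.epSquare &&
               halfmoveClock == o.halfmoveClock && key == o.key;
    }
};

// What MakeNullMove saves so that the undo can restore it.
struct NullUndo {
    int epSquare;
    int halfmoveClock;
    std::uint64_t key;
};

// Placeholder hash constants (assumptions, not real Zobrist tables).
constexpr std::uint64_t kSideKey = 0x9E3779B97F4A7C15ULL;
inline std::uint64_t EpKey(int sq) {
    return 0xD6E8FEB86659FD93ULL * static_cast<std::uint64_t>(sq + 1);
}

NullUndo MakeNullMove(State& s) {
    NullUndo u{s.epSquare, s.halfmoveClock, s.key};
    if (s.epSquare != -1) s.key ^= EpKey(s.epSquare);  // ep right disappears
    s.epSquare = -1;
    s.key ^= kSideKey;
    s.whiteToMove = !s.whiteToMove;
    ++s.halfmoveClock;
    return u;
}

void UndoNullMove(State& s, const NullUndo& u) {
    s.epSquare = u.epSquare;
    s.halfmoveClock = u.halfmoveClock;
    s.key = u.key;
    s.whiteToMove = !s.whiteToMove;
}

// Deliberately broken: forgets to restore the en passant square.
void UndoNullMoveBuggy(State& s, const NullUndo& u) {
    s.halfmoveClock = u.halfmoveClock;
    s.key = u.key;
    s.whiteToMove = !s.whiteToMove;
}

// Make + undo on a copy must give back exactly the state we started with.
template <typename UndoFn>
bool NullMoveRoundTrips(State s, UndoFn undo) {
    const State before = s;
    const NullUndo u = MakeNullMove(s);
    undo(s, u);
    return s == before;
}
```

Note that the broken undo still passes whenever there is no en passant square to restore, which is exactly why this sort of bug shows up so rarely.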

>I couldn't help but notice that in just three ply, black had made one
>move but white had somehow moved his king about 4 squares.  Castling problem?
>By this time I had also replaced the entire random number generation code, and
>added in debugging code all over the place to print out messages in case of
>errors, but I felt I was getting closer.
>
>Then I suddenly realised.  Carlos had been sending strings to the engine of the
>form;
>3r4/5b2/1k1r1p2/Np5p/4P1p1/2R1KPP1/2P4P/R7 w
>
>(i.e. missing off the last two dashes.)  The full string should be
>3r4/5b2/1k1r1p2/Np5p/4P1p1/2R1KPP1/2P4P/R7 w - -
>
>but of course the shortened version should still be valid.  One of the positions
>I had 'fixed' before started working mysteriously after I decided to add back in
>those two dashes just for neatness.  Another started working again for a short
>while when I cut and pasted the entire line minus the end-of-line character, but
>then failed again shortly after, once I cut out the whole line including the EOL
>char again.  Of course I didn't notice these at the time because I was so fixed
>on testing my hashtable.
>
>So what was the bug?  My FEN parser screwed up when it encountered a DOS EOL
>character whilst running on a non-DOS machine.  It ended up interpreting it as a
>totally spurious castling permission, sometimes allowing players to castle when
>they shouldn't have been able to and therefore messing everything up, but
>essentially at random, because the parser was just reading in the string for the
>FEN past the end, and stopping when it came to a space character.  If it met a
>K,Q,k or q in that space it would update the castling permissions, but of course
>this was random unallocated memory, so the result of this was also essentially
>random, meaning that sometimes it failed, and sometimes it didn't.
>
>So after all of this, it was just me writing a bad parser ;)
>
>How annoying is that?  :)
>Can anyone beat this story?

Yeah, the trick is to reproduce the error; the rarer they are, the worse they
are, in a sense.
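
As for the FEN problem quoted above, a parser that splits the string into whitespace-separated fields never reads past the end, and it treats a trailing "\r\n" like any other whitespace. Below is a rough sketch of that approach, assuming (as Colin does) that a string without the castling and en passant fields should be accepted and read as "- -". FenFields and ParseFenFields are hypothetical names, and the piece placement field is not checked here.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical result type: only the fields relevant to this bug.
struct FenFields {
    std::string placement;          // piece placement, not validated in this sketch
    char side = 'w';
    bool wk = false, wq = false;    // white castling rights (king side, queen side)
    bool bk = false, bq = false;    // black castling rights
    std::string enPassant = "-";
};

// Returns false on malformed input; missing trailing fields default to "-".
bool ParseFenFields(const std::string& fen, FenFields& out) {
    out = FenFields{};
    std::istringstream in(fen);
    std::string side, castling, ep;

    // operator>> skips any whitespace, including '\r', and stops at end of string.
    if (!(in >> out.placement >> side)) return false;
    if (side != "w" && side != "b") return false;
    out.side = side[0];

    if (in >> castling) {
        for (char c : castling) {
            switch (c) {
                case 'K': out.wk = true; break;
                case 'Q': out.wq = true; break;
                case 'k': out.bk = true; break;
                case 'q': out.bq = true; break;
                case '-': break;
                default: return false;  // reject garbage instead of guessing
            }
        }
    }
    if (in >> ep) out.enPassant = ep;
    return true;
}
```

With this, the shortened string with a DOS line ending gives no castling rights for either side, and an unexpected character in the castling field is reported as an error rather than silently changing the permissions.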

-S.

>Cheers,
>Col



Last modified: Thu, 15 Apr 21 08:11:13 -0700
