Computer Chess Club Archives


Subject: Re: Most obscure bug *ever*

Author: Will Singleton

Date: 13:32:36 10/23/02


On October 23, 2002 at 15:25:04, Colin Frayn wrote:

>As most of you know, writing chess programs is fairly tricky.  Weird bugs arise
>to do with NULL move and hashtables that take ages to fix, especially when you
>get lots of complicated algorithms all working against each other.
>
>So imagine my horror when Carlos at Chessbrain.net told me that Beowulf was
>occasionally returning empty PV strings from searches, indicating that the
>hashtable was broken (I get the PV from the hashtable directly).  This annoyed
>me especially as I'd not heard of this problem before and I realised I must have
>broken something recently.
>
>I tried to verify this on my PC at home in Windows, but couldn't.  Carlos tested
>it and we discovered that the problem only ever occurred rarely, and only on
>non-Windows boxes.  At this stage I was thinking 'compiler error?' or perhaps
>'memory leak?'  Dreaded memory leaks take ages to find.
>
>Anyway, I did a *lot* of testing, with the fabled bug proving exceptionally
>elusive.  Often it would fail for a few goes and then fix itself randomly.  One
>time I was trying to get the bug to work on this one position, and it did so for
>the majority of the day, and then the following day that position was absolutely
>fine - no problem at all, whatever I did.  I began to wonder if it was something
>to do with the random number code, which would be seeded differently every time
>the program was run.  I replaced it all.
>
>Carlos provided me with a new position that failed and so I started to try the
>debugging.  Eventually I found out that the root position was not being updated
>properly.  I altered the hash replacement scheme, altered the hash update
>scheme, changed loads of things around in the search function, decided to store
>the full 64-bit hash key instead of just a 40-bit safe key.  Basically, I spent
>ages trying to work out what was wrong, but still no luck.
>
>After a *lot* of testing, I finally managed to track the bug down to the fact
>that the hash key was becoming corrupted at some point during the search.  I
>began to test the DoMove() function, and also a few other things that could have
>caused this.  We installed Electric Fence and checked for memory leaks.  No joy
>(*sigh of relief*).  Somehow the hash key stored in the Board structure (which
>is continuously updated during DoMove()) had become corrupted so that it didn't
>correspond to the current board position any more.
>
>Then I managed to add in some debugging code which quit as soon as the key
>stored in the Board structure was no longer correct...

If you'd done that first... :)
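
For anyone who wants to try the same check, here is a minimal sketch of debugging code that stops as soon as the stored key no longer matches the position. It is not Beowulf's code: it assumes a Zobrist-style XOR key, and every name in it (Board, do_move, compute_key, check_key, ZOBRIST) is hypothetical. Side to move, castling rights and en passant are left out to keep it short.

```python
import random

# Hypothetical names throughout; this is an illustration, not Beowulf's code.
PIECES = "PNBRQKpnbrqk"
_rng = random.Random(2002)  # fixed seed so a failing position stays reproducible
ZOBRIST = {(p, sq): _rng.getrandbits(64) for p in PIECES for sq in range(64)}


def compute_key(squares):
    """Recompute the hash key from scratch by XOR-ing every piece/square entry."""
    key = 0
    for sq, piece in squares.items():
        key ^= ZOBRIST[(piece, sq)]
    return key


class Board:
    def __init__(self, squares):
        # squares maps a square index (0..63) to a piece letter
        self.squares = dict(squares)
        self.key = compute_key(self.squares)


def check_key(board):
    """Stop immediately if the stored key no longer matches the position."""
    expected = compute_key(board.squares)
    if board.key != expected:
        raise AssertionError(
            f"hash key corrupted: stored {board.key:#018x}, "
            f"expected {expected:#018x}"
        )


def do_move(board, src, dst):
    """Move a piece, update the key incrementally, then verify it."""
    piece = board.squares.pop(src)
    captured = board.squares.pop(dst, None)
    board.key ^= ZOBRIST[(piece, src)]          # piece leaves its square
    if captured is not None:
        board.key ^= ZOBRIST[(captured, dst)]   # captured piece disappears
    board.key ^= ZOBRIST[(piece, dst)]          # piece arrives on the new square
    board.squares[dst] = piece
    check_key(board)                            # debug only: costly, remove for speed
```

The full recomputation is far too slow to leave in a real search, so a check like this would normally sit behind a debug switch, but it fails at the first move where the key goes wrong rather than many plies later.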

I had a bug when I switched from the Mac to the PC.  Seems that the Mac has
different line endings for text files, so when I transferred the sources over to
MSVC, weird things happened.  The strangest was that the code *following*
certain commented-out code compiled differently if the comments were "//" rather
than "/* */".  It was the funniest thing.  I was trying to debug a procedure
*above* the procedure where the compiler was actually changing things.

Microsoft acknowledged the bug, and said it will be fixed in the next release.
Specifically, the text editor is supposed to guarantee that all line endings are
correct each time a file is saved.  But it doesn't.  For anyone whose debugger
gives the wrong line numbers for breakpoints, better watch out.

Will
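
A quick way to check whether a source file has this kind of problem is to count the line-ending styles in its raw bytes. The sketch below is only an illustration: the function names are made up, and it assumes the trouble comes from bare CR endings (the classic Mac convention) surviving next to CRLF or LF endings. The post does not say exactly how the files were damaged.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"crlf": crlf, "lf": lf, "cr": cr}


def has_mixed_endings(data: bytes) -> bool:
    """True if more than one line-ending style appears in the data."""
    counts = count_line_endings(data)
    return sum(1 for n in counts.values() if n > 0) > 1


def scan_file(path: str) -> dict:
    """Read a file in binary mode (so nothing is translated) and count endings."""
    with open(path, "rb") as handle:
        return count_line_endings(handle.read())
```

A file that reports more than one non-zero count, or any bare CRs at all on a PC, is worth converting before blaming the compiler or the debugger.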


