Computer Chess Club Archives



Subject: Re: quarkx v monsoon-ccct4

Author: Ulrich Tuerke

Date: 07:58:12 01/21/02



On January 20, 2002 at 22:16:08, Scott Gasch wrote:

>On January 20, 2002 at 19:48:36, Claudio Della Corte wrote:
>
>>On January 20, 2002 at 17:24:56, Scott Gasch wrote:
>>
>>[...]
>>>Move 118 is the bug.  No idea what it was, looks to be some kind of hash
>>>problem.  I can't get it to reproduce.
>>
>>It looks very much like a bug due to the TBs, which must be handled carefully if
>>the X-men set is not complete. Just a guess.
>>Claudio
>
>Well I'm on the hunt tonight.  I guess I did a stupid thing: shortly after this
>happened I decided monsoon was totally messed up and I needed to kill it /
>restart before the next round.  Well that overwrote the logfile with the
>evidence in it... dumb.  So the first thing I've done is get rid of my stupid
>code that overwrites logfiles and make it do a numbering scheme.  A little late
>huh?
>
>I can't reproduce the move by inputting the PGN or just running the FEN.  I also
>tried running the positions in that game near the blunder with a full paranoid
>build (which is about 100nps because of all the stuff it checks) and come up
>with nothing.
>
>So I am left to speculate here.  My first instinct is a hash bug so I've looked
>over my hash code very carefully, added a bunch of asserts, etc.  I think I may
>have found a problem and I've got an int 3 on it.. if it happens I'll know.
>
>Next thing is the egtb files themselves.  This is where I could use some help.
>I now am starting to see the reason for Bruce's "paranoia" about code he didn't
>write... I turned on TB_CRC_CHECK in my code that probes Eugene's tables as well
>as in Eugene's egtb.cpp.  No CRC problems to report.  I am going to grab a
>source version of Crafty and make sure this egtb code hasn't changed since I
>stole it and merged it into my engine.
>
>The last thing I can come up with is a bit flip in memory.  Yes you think I am
>crazy but debugging crashed kernels at work I have seen this before, albeit
>rare.  I've got a machine in my office where I can tell you the physical address
>it happens at and which bit will get asserted.  Anyway, I have a stupid little
>memory check utility (it locks a huge buffer in physical memory and runs pattern
>tests on it) that I wrote; it will run on monsoon's hardware overnight.  I don't
>expect to find anything... for one I can't get all the memory (drivers / kernel
>need some, though I can get about 85% of it) and for two the memory system of my
>PC is a hell of a lot better tested than my chess engine.  So this is a shot in
>the dark.
>
>I also went to every place in my code where I typecast something and double
>checked the asserts above the cast.  I was thinking I could have dropped a sign
>bit in a cast or something but this seems unlikely now.
>
>I'm more than open to suggestions about what else to try -- I'd be grateful for
>any ideas.

This reminds me of a problem I had some time ago.
The problem was related to storing mate values in the hash table.
Usually, you add some increment ("ply" or so) to the mate value to correct for
the distance to the root of the tree. Furthermore, you probably store a flag
indicating whether it's a bound or an exact value.
When storing the flag, I compared the modified mate value against alpha and beta.
This was of course wrong; I had to compare the original mate value.
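
To make the pitfall concrete, here is a minimal sketch in Python. It is not taken from any engine: the constants MATE and MAX_PLY, the flag names, and all function names are assumptions made up for the illustration. It assumes mate scores are encoded as MATE minus the distance to the root, and that the ply is added before storing so the hash entry holds a node-relative value.

```python
# Hypothetical constants, chosen only for this illustration.
MATE = 32000                 # score meaning "mate at the root"
MAX_PLY = 128                # assumed maximum search depth
MATE_BOUND = MATE - MAX_PLY  # scores beyond this are treated as mate scores

EXACT, LOWER, UPPER = "exact", "lower", "upper"


def score_to_tt(score, ply):
    """Adjust a root-relative mate score by ply before storing it."""
    if score >= MATE_BOUND:
        return score + ply
    if score <= -MATE_BOUND:
        return score - ply
    return score


def score_from_tt(stored, ply):
    """Undo the adjustment when the entry is probed at some ply."""
    if stored >= MATE_BOUND:
        return stored - ply
    if stored <= -MATE_BOUND:
        return stored + ply
    return stored


def bound_flag(score, alpha, beta):
    """Classify a score against the search window."""
    if score <= alpha:
        return UPPER   # fail low: the true value is at most this
    if score >= beta:
        return LOWER   # fail high: the true value is at least this
    return EXACT


def make_entry(score, alpha, beta, ply):
    """Flag is derived from the ORIGINAL score, value is stored adjusted."""
    return (score_to_tt(score, ply), bound_flag(score, alpha, beta))


def make_entry_buggy(score, alpha, beta, ply):
    """The mistake described above: flag derived from the adjusted score."""
    stored = score_to_tt(score, ply)
    return (stored, bound_flag(stored, alpha, beta))


if __name__ == "__main__":
    score, alpha, beta, ply = MATE - 10, MATE - 20, MATE - 8, 6
    print(make_entry(score, alpha, beta, ply))
    print(make_entry_buggy(score, alpha, beta, ply))
```

With this window the original score lies strictly between alpha and beta, so the entry should be flagged exact. Adding the ply pushes the stored value past beta, and the buggy variant flags it as a lower bound instead. Whether something like this is behind move 118 is pure speculation, of course.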

Just an idea,
Uli

>
>Thanks,
>Scott




Last modified: Thu, 15 Apr 21 08:11:13 -0700
