Computer Chess Club Archives


Search

Terms

Messages

Subject: Thanks! n/t

Author: Daniel Shawul

Date: 22:39:56 06/17/05

Go up one level in this thread


On June 17, 2005 at 19:10:08, Vincent Diepeveen wrote:

>On June 17, 2005 at 01:52:45, Daniel Shawul wrote:
>
>>On June 16, 2005 at 18:01:58, Vincent Diepeveen wrote:
>>
>>>On June 15, 2005 at 16:36:54, Eric Oldre wrote:
>>>
>>>Of course the more time you invest in debugging the better.
>>>
>>
>>Hi
>>i have tried to output the tree to disk,when i tried parallel
>>tree searching using PV split method. Thread1 outputs to file1,and
>>Thread2 to file2. But the output produced is unreadable.
>>you might have already explanined how to do this, but i couldn't get it.
>>best
>>daniel
>
>For parallel search what really debugged all my bugs is next :
>
>at startup i allocate 1 big shared memory block which is a huge array of
>integers. additional i have 1 global volatile lock in the shared memory.
>
>each processor is doing next in search. If they hit at a line that writes debug
>info, then i'm doing basically this :
>
>In the code you will find for example this :
>
>      #if DEBUGNODESVERBOSE
>      dverb_info[0] = (int)rb->idnumber;
>      dverb_info[1] = rb->alfa;
>      dverb_info[2] = rb->beta;
>      dverb_info[3] = rb->ply; // depthleft
>      DVERB(4,dverb_abigetblock);
>      #endif
>
>Now the function DVERB is doing this :
>void DVERB(int ninfo,int dverbflag) {
>  int header,i;
>  /*  header byte:
>   *  32 bits waarvan:
>   *    1   sign   (unused)
>   *   17   laatste debugsmp punt (indien aanwezig)
>   *   10   verbosepoint:    dverb_...
>   *    4   processornummer (max 0..15)
>   */
>  //printf("locking dverb ninfo=%i flag=%i\n",ninfo,dverbflag);
>  Lock(tree->lock_dva);
>  header = 0;
>  //#endif
>  header |= (dverbflag<<4)|ProcessNumber;
>  dva[tree->dva_count++] = header;
>  i = 0;
>  while( i < ninfo ) {
>    dva[tree->dva_count++] = dverb_info[i];
>    i++;
>  }
>
>  if( tree->dva_count >= DVASIZE-64 ) {
>    int tdr;
>
>    printf("het probleem is duidelijk!\n");
>    tdr = tree->dva_rond+1;
>  //  printf("Passed DVASIZE (%i bytes) already %i
>times!\n",sizeof(int)*tree->dva_count,tdr);
>    if( nprocesses == 1 && false ) {
>      //printf("gonna save it now\n");
>      Unlock(tree->lock_dva);
>      SMPdebugSAVE();
>      InitVerbose();
>      Lock(tree->lock_dva);
>    }
>    else {
>      int m,move;
>      //mprint("processes = %i rounds = %i\n",tree->nprocesses,tdr);
>      InitVerbose();
>      /* nu vastleggen het volgende: de zetstack van de verschillende processors
>*/
>      dva[tree->dva_count++] = (dverb_movestacks<<4);
>      for( i = 0; i < nprocesses; i++ ) {
>        dva[tree->dva_count++] = 5;
>        m = 0;
>        //mprint("stored line in DVA: %i ... ",tree->movesmadeindex[i]);
>        while( m < 5 ) {
>          move = m;
>          mprint("%s%s ",M2A[(move>>6)&63],M2A[move&63]);
>          dva[tree->dva_count++] = move;
>          m++;
>        }
>        //mprint("\n");
>      }
>    }
>    tree->dva_rond = tdr;
>  }
>  Unlock(tree->lock_dva);
>}
>
>So basically lock, store the search info as integers in the array and unlock.
>When it has crashed, all you have to do is write the shared memory to disk and
>then parse the shared memory until you found the processor that crashed and you
>figure out where it was and what it was doing and the state at which other
>processors were at that moment and so on.
>
>And then pray that you crash very quickly. Basically when this testbed works
>fine all you have got to do is find the quickest way to crash or get a deadlock
>or race condition in your program. Then debug backwards.
>
>That's not so easy and requires a lot of experience :)
>
>But you can get very complex parallel forms of search to work with this method.
>Because basically every processor directly locks and unlocks, the odds are not
>so big that this 'debug code' in shared memory will eat all system time.
>
>Your evaluation will, your move ordering will and so on.
>
>It's just a sequential fill of shared memory.
>
>Then additional write tools to find relevant data within that huge file.
>
>Ways to really quickly crash your program is important to find too. A good one
>i'll give for free, that's use the utmost stupid way to split, like splitting
>always and especially as much near the leaves as possible. And start searching
>with 16 cpu's :)
>
>That really crashes the program quickly :)
>
>Now the way in which i print out things. To give a real example of 16 processors
>(note my printing routines aren't even good working for it) :
>
>0...............  SplitInOthers: processnumber=0 newproc=8 idnumberposition=1236
>0...............  SplitInOthers: processnumber=0 newproc=15 idnumberposition=123
>6
>0...............  SplitInOthers: processnumber=0 newproc=1 idnumberposition=1236
>..........10.....  Splitting position id=1236
>..........10.....  Splitting position: found and moved to position 1236
>.......7........  Splitting position id=1236
>.........9......  Splitting position id=1236
>.........9......  Splitting position: found and moved to position 1236
>.......7........  Splitting position: found and moved to position 1236
>.....5..........  Splitting position id=1236
>.....5..........  Splitting position: found and moved to position 1236
>..2.............  Splitting position id=1236
>..2.............  Splitting position: found and moved to position 1236
>...........11....  Splitting position id=1236
>...........11....  Splitting position: found and moved to position 1236
>........8.......  Splitting position id=1236
>........8.......  Splitting position: found and moved to position 1236
>............12...  Splitting position id=1236
>............12...  Splitting position: found and moved to position 1236
>...3............  Splitting position id=1236
>...3............  Splitting position: found and moved to position 1236
>......6.........  Splitting position id=1236
>......6.........  Splitting position: found and moved to position 1236
>....4...........  Splitting position id=1236
>.............13..  Splitting position id=1236
>..............14.  Splitting position id=1236
>....4...........  Splitting position: found and moved to position 1236
>...............15  Splitting position id=1236
>.1..............  Splitting position id=1236
>..............14.  Splitting position: found and moved to position 1236
>...............15  Splitting position: found and moved to position 1236
>.............13..  Splitting position: found and moved to position 1236
>.1..............  Splitting position: found and moved to position 1236
>............12...  Abortme() id=1236
>............12...  splitid=0 depth=5 procs=[0,6,10,3,2,-1,9,13,4,11,5,7,14,8,15,
>1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>............12...  PartBoth: proc just returned to idle searching 0 nodes
>............12...  ActionCheck() flags=1 realply=0
>.....5..........  Abortme() id=1236
>.....5..........  splitid=0 depth=5 procs=[0,6,10,3,2,-1,9,13,4,11,-1,7,14,8,15,
>1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>.....5..........  PartBoth: proc just returned to idle searching 0 nodes
>.....5..........  ActionCheck() flags=1 realply=0
>..............14.  Abortme() id=1236
>..............14.  splitid=0 depth=5 procs=[0,6,10,3,2,-1,9,13,4,11,-1,7,-1,8,15
>,1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>..............14.  PartBoth: proc just returned to idle searching 0 nodes
>..............14.  ActionCheck() flags=1 realply=0
>...3............  Abortme() id=1236
>...3............  splitid=0 depth=5 procs=[0,6,10,-1,2,-1,9,13,4,11,-1,7,-1,8,15
>,1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>...3............  PartBoth: proc just returned to idle searching 0 nodes
>...3............  ActionCheck() flags=1 realply=0
>......6.........  Abortme() id=1236
>......6.........  splitid=0 depth=5 procs=[0,-1,10,-1,2,-1,9,13,4,11,-1,7,-1,8,1
>5,1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>......6.........  PartBoth: proc just returned to idle searching 0 nodes
>......6.........  ActionCheck() flags=1 realply=0
>.............13..  Abortme() id=1236
>.............13..  splitid=0 depth=5 procs=[0,-1,10,-1,2,-1,9,-1,4,11,-1,7,-1,8,
>15,1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>.............13..  PartBoth: proc just returned to idle searching 0 nodes
>.............13..  ActionCheck() flags=1 realply=0
>....4...........  Abortme() id=1236
>....4...........  splitid=0 depth=5 procs=[0,-1,10,-1,2,-1,9,-1,-1,11,-1,7,-1,8,
>15,1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>....4...........  PartBoth: proc just returned to idle searching 0 nodes
>....4...........  ActionCheck() flags=1 realply=0
>.1..............  Abortme() id=1236
>.1..............  splitid=0 depth=5 procs=[0,-1,10,-1,2,-1,9,-1,-1,11,-1,7,-1,8,
>15,-1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>.1..............  PartBoth: proc just returned to idle searching 0 nodes
>.1..............  ActionCheck() flags=1 realply=0
>...............15  Abortme() id=1236
>...............15  splitid=0 depth=5 procs=[0,-1,10,-1,2,-1,9,-1,-1,11,-1,7,-1,8
>,-1,-1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>...............15  PartBoth: proc just returned to idle searching 0 nodes
>...............15  ActionCheck() flags=1 realply=0
>0...............  Abortme() id=1236
>0...............  splitid=0 depth=5 procs=[-1,-1,10,-1,2,-1,9,-1,-1,11,-1,7,-1,8
>,-1,-1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>0...............  PartBoth: proc just returned to idle searching 0 nodes
>0...............  ActionCheck() flags=1 realply=0
>..........10.....  Abortme() id=1236
>..........10.....  splitid=0 depth=5 procs=[-1,-1,-1,-1,2,-1,9,-1,-1,11,-1,7,-1,
>8,-1,-1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>..........10.....  PartBoth: proc just returned to idle searching 0 nodes
>..........10.....  ActionCheck() flags=1 realply=0
>..2.............  Abortme() id=1236
>..2.............  splitid=0 depth=5 procs=[-1,-1,-1,-1,-1,-1,9,-1,-1,11,-1,7,-1,
>8,-1,-1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>..2.............  ActionCheck() flags=1 realply=0
>.......7........  Abortme() id=1236
>.......7........  splitid=0 depth=5 procs=[-1,-1,-1,-1,-1,-1,9,-1,-1,11,-1,-1,-1
>,8,-1,-1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>.......7........  PartBoth: proc just returned to idle searching 0 nodes
>.......7........  ActionCheck() flags=1 realply=0
>........8.......  Abortme() id=1236
>........8.......  splitid=0 depth=5 procs=[-1,-1,-1,-1,-1,-1,9,-1,-1,11,-1,-1,-1
>,-1,-1,-1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>........8.......  PartBoth: proc just returned to idle searching 0 nodes
>........8.......  ActionCheck() flags=1 realply=0
>...........11....  Abortme() id=1236
>...........11....  splitid=0 depth=5 procs=[-1,-1,-1,-1,-1,-1,9,-1,-1,-1,-1,-1,-
>1,-1,-1,-1] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>...........11....  PartBoth: proc just returned to idle searching 0 nodes
>...........11....  ActionCheck() flags=1 realply=0
>.........9......  MakeSplittable() id=1233 realply=3
>.........9......  registeridlesplits owner proc=9
>.........9......  splitid=0 depth=3 procs=[11,8,7,2,10,0,9,15,1,4,13,6,3,14,5,12
>] daughters=[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1]
>.........9......  SplitInOthers: processnumber=9 newproc=11 idnumberposition=123
>3
>and so on
>
>As you see this output is just printing all actions about parallel search, not
>even printing info regarding the actual searched tree (as that would be hundreds
>of megabytes).
>
>The huge advantage of writing the debug info to shared memory is that it eats so
>little time that the processors get more time into their splitting routines and
>get therefore bigger odds to crash if there is a bug.
>
>>>Apart from boundschecking with a real boundschecker; gcc has a great
>>>boundschecker extension at their homepage, windows has nothing that impressed me
>>>in fact like devpartner doesn't even find the most obvious boundsbug in my GUI.
>>>Further there is normal debugging ways, the most effective way in which i debug
>>>Diep if there is problems is in the next trivial way:
>>>
>>>That simply printing the entire search tree to disk. That can be done either
>>>direct or indirect :
>>>
>>>example output (written by hand just to show example now for 4 processors) :
>>>
>>>0... searching root
>>>0...  makemove a2a3
>>>0... a2a3 makemove d7d5
>>>0... a2a3 d7d5 eval -0.200
>>>0... a2a3 d7d5 ab=[0.001;0.002] cutoff -0.200
>>>...
>>>
>>>And when processor 1 is doing an action :
>>>.1.. blabla
>>>
>>>In short this is a very powerful way to debug search or debug parallel bugs.
>>>Of course for tough race conditions in a parallel search it is not so clever to
>>>directly print to textfile, as the time spent to print is too long (which means
>>>a deadlock or race conditions will happen not so soon as the time spent to
>>>printing is too huge).
>>>
>>>So i print it in a shared block of memory (locking first of course) and later
>>>parse that memory block to the above text.
>>>
>>>In this way i managed to debug the parallel search of Diep.
>>>
>>>For single cpu just printing directly to disk will do fine and you can examine
>>>how big your trees get at small search depths up to a ply or 6 is no problem.
>>>
>>>You'll find many weird stuff in your qsearch and your move ordering like this.
>>>
>>>Of course it's time consuming to do the above (reading those huge textfiles) but
>>>it's very rewarding.
>>>
>>>The above has another huge advantage and that it is finding bugs in your
>>>evaluation function and directly gives you an answer whether a new compiler
>>>sucks, is dangerous to use or whether you have a so far unfound bug which
>>>boundscheckers didn't find.
>>>
>>>That is, provided you also print the evaluation score in the textfile.
>>>
>>>You can compile at different platforms with different type of compilers your
>>>code and just compare the textfile verbose. If it is not 100% similar, then
>>>obviously there is a bug somewhere.
>>>
>>>This really will remove about all nasty to find bugs in your entire
>>>search&evaluation function.
>>>
>>>It is the by far the best test to see whether a new compiler has bugs.
>>>
>>>Especially many intel compilers and certain gcc compilers did not pass this test
>>>in past when using -O3 or in case of intel just -O2. gcc 3.3.x and 3.4.x was
>>>real real buggy here. 4.0 is far better.
>>>
>>>Of course the fact that you *seriously* look to debugging methods in your engine
>>>already means that your thing is going to be more bugfree than majority of
>>>amateur programs. Most haven't heard in fact from the word 'boundschecker'.
>>>
>>>The fact that out of any random compiler with this trick i can find
>>>optimizations that produce bugs, tells me how poor the different compiler teams
>>>are in debugging their compilers.
>>>
>>>Diep is not using any difficult casting construction or so, i never take risks
>>>there. Additional it is 100% integer code, so no floating point tricks are
>>>possible for compilers.
>>>
>>>Especially the GCC team is the most horrible code writing team on planet earth.
>>>
>>>Like 99% of their releases suffers so much from bugs that i can't even compile
>>>with -O3 without catching bugs.
>>>
>>>Obviously that i can prove that with such a simple debugging method is telling
>>>really a lot about how powerful the verbose printing method of a search is.
>>>
>>>>I was wondering if anyone would like to volunteer any tricks they've
>>>>used to help find certain bugs in their search function.
>>>
>>>>I think that the ideas of using perft for move generation and
>>>>reversing the board to find bugs in the evaluation have both
>>>>been really useful to me. I was wondering if anyone has used
>>>>techniques similar to these to help find search bugs.
>>>>
>>>>I understand that just because a engine can properly pass these
>>>>and other tests doesn't mean it's free of bugs. But they certainly
>>>>help detect at least some of them.
>>>>
>>>>I'm certain that there must be plenty of bugs in Latista's search
>>>>and I think it's time for me to work on discovering them. If
>>>>you don't have any automated tricks like above. Does anyone
>>>>have any general advise to help me spot some bugs?
>>>>
>>>>Eric Oldre
>>>>
>>>>PS. I have at various spots in my program tried to follow a similar
>>>>model of asserts as in fruit. I'm sure taking some time to
>>>>do this at more parts of my program would help.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.