Computer Chess Club Archives




Subject: Re: MTD(f) (was Re: New(?) search idea.)

Author: Don Dailey

Date: 21:18:18 01/22/98

Go up one level in this thread

Hi Bob,

Concerning point 1, "psuedo stable evaluations" :

We try really hard to avoid the unstable evaluations.   Our philosophy
(and I realize it's just that so don't blast me please) is that these
wild swings usually (but not always) indicate a problem with gradualism
in the program.  For instance you may give big bonuses to get to some
square but no bonus for getting close.  Or your endgame threshold is
not gradual but sudden and harsh so the king goes from being great to
sucking in one fell swoop.  When this happens you don't make fine
decisions but everything is based on the one silly point.   Some of this
however is unavoidable, for instance rook gets on 7th and then is driven
away on the next move.

Aske Plaat recommends the mtd increment be based on the granularity of
your evaluation.  If a pawn is 10,000 points your mtd increment should
be much higher.   Also we use a stand pat bonus to smooth out the odd/
even affect.  I don't know if this is common practice or not but I
others do it too.

Anyway, lot's of room for philisophical disagreement on this one and
I freely admit there are lot's of issues that interact so I'm definitely
not claiming you are doing anything wrong.

2. You mentioned extension based on the target score (alpha/beta.)  We
used to do this and eventually threw these out long ago because they
do cause problems whether you use mtd or not.  So far our program
has met your requirements for MTD candidacy!

3. We did experiment with maintaining upper and lower bounds in the
hash table and felt that it would help MTD even more.  But our
reported no improvement.  I may go back and repeat because I never trust
ideas I didn't implement myself!   I think there might be some issues
with best move too that might make a difference, namely, that with MTD
you may be storing a best move that is quite horrible, but good enough
to get a cutoff.  Usually this would be poor captures that happen to
be first in the move list.  It's probably just as well to not store
these (or overwrite these) if you can determine which ones they are.

The principal variation issue is definitely a pain but I haven't
the problem with a vengance and full concentration yet.  I think this
is a solvable problem and will probably work on this next.  I hope to
learn a lot from the effort.

You talked a little about your parallel implementation and indicated
that PVS was wrong for it.  I can't think of any inherent reason for
this but then again I do not understand the details of your

Our parallel code is trivial to implement and you can probably easily
read and understand this example code.  Note that this routine is
pretty much completely correct except for a bunch of details that do
not concern the parallel aspects of it.  Also I'm not treating nega-max
or mini-max correctly, but I'm just showing the process.  Please take a
quick look and tell me if its the basic algorithm you used:

cilk int search( POSITION *p, int depth, int move )
  POSITION   new_position;
  int        cutoff = 0;

    inlet void search_results(int score)
      /** test for cutoff from any spawned search **/

      if ( score > new_position.gamma) {
        cutoff = 1;
        abort;        /** Cilk abort function  **/

  new_position = *p;
  make( &new, move );

  /**  Search first move serially   **/

  score = spawn search( &new, depth-1, firstmove );
  sync;  /* must sync even though not parallel or this falls through! */

  if (score > gamma) return( score );

  /** search all others in parallel **/

  for (i=2nd_move; i<=lastmove; i++)
    search_results(  spawn search(&cur, depth-1, move[i]) );
    if (cutoff) break;


  return best_score;


Essentially we search the first move at any node serially (wait for
the results) and then spawn all the rest off in parallel.   In actual
practice of course 99% of all this ends up being searched serially
anyway, especially with 4 processors but if processors are available
to do the work then they will.   Cilk is just a 'c' preprocessor that
all the scheduling details.  We use a work stealing algorithm so
that when a processor runs out of work it "steals" work from another
processor in the order the work was posted (or spawned.)

When a spawn returns from it's search, the "inlet" is invoked and if
a cutoff is possible the "abort" statement will kill any work that
is currently being done by other processes and the "sync" will happen
very quickly.

I think this is essentially the same algorithm most parallel chess
programs use (Frenchess starts up all processors on the same position
and uses the hash tables for bookeeping but in principle reduces to a
very similar algorithm although it looks completely different at first

I hope others see this and realize how trivial it is to implement
a parallel chess program.  SMP are becoming more and more common
and are easily affordable to most of us.   If anyone is interested
in pursuing this, CILK is available for free and has been ported to
Alphas, Linux PC's, Sparcs and SGI machines.  We have a version
now also for networked machines.  The implementation is as easy
as you see it here.

Aske Plaat ported an old version of Crafty to Cilk.  Since Crafty
was designed to be a serial program he had some problems with all
the global state.  But even this did not prevent him from porting
it in a few days (even though he was completely unfamiliar with the
code.)   If anyone decides to write a parallel program I highly
recommend encapsulating all your position state (the board and
relevant state) into a structure.   If you are concerned that
creating new states for each node will be too expensive, don't
be.  Modern processors copy contiguous data extremely rapidly
and you avoid some work with keeping unmake information (you never
unmake a move.)   The same code (a big benefit of using cilk) will
produce a serial version of the program that runs very fast and I
promise it will be competitive with your current serial program.
(I think I hear someone in the background screaming something about
data cache but I can't quite make out what they're saying.)

Cilk was designed for chess and is extremely efficient.  You can
link in assembly code if that is your thing.  All of the parallel
user routines in cilk get compiled as 2 copies, 1 for speed and
efficiency and 1 with the parallel bookeeping overhead.  Most of
the time you are executing the fast "serialized"  code and only
when work gets stolen do you suffer from the overhead.  The net
effect is that the parallel code on 1 processor is almost as fast
as the purely serial program.  This is not typical of parallel code.

Also there is no concept of a "boss process" that divy's out work
etc.  You do not need to know or care how many processors are involved
although you have access to this information if you want it.

This post ends up sounding like a commercial.  But I wanted everyone
to know this is not a secret technology and we WANT people to use
it.  I am not afraid you guy's will get a hold of it and compete
with me either.  My programs are good enough to hold their own and
I like competition!   With CILK, parallel chess is almost trivial
and I hope some of you give it a try.

- Don

On January 22, 1998 at 15:35:39, Robert Hyatt wrote:

>On January 22, 1998 at 14:28:27, Stuart Cracraft wrote:
>>On January 22, 1998 at 13:33:42, Don Dailey wrote:
>>>By the way, I use an algorithm called mtd(f) and there is no concept
>>>of a "window" in the search itself (just the search driver.)  All
>>>searches return quickly and are guaranteed to fail!  In some sense
>>>the same moves get searched over and over on the same iteration
>>>although the hash table tends to avoid most of this work (except
>>>when it is needed to prove a point.)   The search driver is tricky
>>>piece of code and the obvious approach is quite bad (don't use a
>>>binary search mtd probe strategy!)
>>>The thing I hate about mtd is the data returned is so different
>>>from a conventional program.  You cannot get a nice list of moves
>>>showing progress and the pv appears many times (usually modified)
>>>each time.  Even though I have good root move ordering I stopped
>>>thinking about it with mtd(f).  Your ideas almost sounded foreign
>>>to me because of this!   Massaging my root move list would clearly
>>>be a problem for efficiency.   MTD is so good I cannot imagine
>>>using anything else though.  I am getting over 15%  speedup over
>>>my very best implementation of pvs search (margin on first move,
>>>zero width on siblings with research on fail hi's)   I prefer
>>>PVS's behavior but cannot ignore the extra performance of MTD(f).
>>>The (f) refers to the type of driver used.  I don't strictly use
>>>the 'f' driver because I found some enhancements of my own but
>>>it's basically mtd(f)
>>>- Don
>>I tried MTD(f) at one point and was impressed as well. Also, I saw
>>the tables in Aske Platt's thesis comparing it with other search
>>techniques for three strategy games. Those tables show really good
>>speedups for chess.
>>Now, the question. I still have my MTD(f) code but it does not
>>record the principal variation.
>>How did you do it? Please be as specific as you can. Now that my
>>transposition table is working much better, I'd like to consider
>>re-using my past MTD(f) code.
>>15% is nothing to sneeze at.
>Note that you have to have a couple of characteristics in your program
>or you will have big troubles:
>1.  you need a psuedo-stable evaluation.  It can't produce large swings
>from iteration to iteration.  If you do, you get too far off the mtd(f)
>window and require extra re-searches to get back, only to be too far off
>in the other direction the next iteration.  This killed me because my
>evaluation can produce significant variations from one iteration to the
>next (+/- .5 is not uncommon)
>2.  no extensions depending on alpha and beta values.  These cause more
>search inconsistencies than Carter's has little liver pills.
>3.  you should modify your hash table to store both upper and lower
>bounds, since you will be searching the same tree over and over, and
>failing high one time, failing low the next.
>I didn't like it because for me it was slower.  I didn't want to
>my eval to not produce significant variations from ply to ply.  And I
>not getting a good PV as it made debugging and testing very difficult.
>can get some stuff from the hash table, but not always enough.
>It has it's place in one type of parallel search.  The way I did the
>search in Cray Blitz it would be totally useless however.  I'm a PVS'er
>life, probably.  :)

This page took 0.03 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.