Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: MTD(f) (was Re: New(?) search idea.)

Author: Robert Hyatt

Date: 19:23:36 01/22/98

Go up one level in this thread


On January 22, 1998 at 21:59:57, Don Dailey wrote:

>On January 22, 1998 at 14:28:27, Stuart Cracraft wrote:
>
>>On January 22, 1998 at 13:33:42, Don Dailey wrote:
>>
>>>By the way, I use an algorithm called mtd(f) and there is no concept
>>>of a "window" in the search itself (just the search driver.)  All
>>>searches return quickly and are guaranteed to fail!  In some sense
>>>the same moves get searched over and over on the same iteration
>>>although the hash table tends to avoid most of this work (except
>>>when it is needed to prove a point.)   The search driver is tricky
>>>piece of code and the obvious approach is quite bad (don't use a
>>>binary search mtd probe strategy!)
>>>
>>>The thing I hate about mtd is the data returned is so different
>>>from a conventional program.  You cannot get a nice list of moves
>>>showing progress and the pv appears many times (usually modified)
>>>each time.  Even though I have good root move ordering I stopped
>>>thinking about it with mtd(f).  Your ideas almost sounded foreign
>>>to me because of this!   Massaging my root move list would clearly
>>>be a problem for efficiency.   MTD is so good I cannot imagine
>>>using anything else though.  I am getting over 15%  speedup over
>>>my very best implementation of pvs search (margin on first move,
>>>zero width on siblings with research on fail hi's)   I prefer
>>>PVS's behavior but cannot ignore the extra performance of MTD(f).
>>>
>>>The (f) refers to the type of driver used.  I don't strictly use
>>>the 'f' driver because I found some enhancements of my own but
>>>it's basically mtd(f)
>>>
>>>- Don
>>
>>I tried MTD(f) at one point and was impressed as well. Also, I saw
>>the tables in Aske Platt's thesis comparing it with other search
>>techniques for three strategy games. Those tables show really good
>>speedups for chess.
>>
>>Now, the question. I still have my MTD(f) code but it does not
>>record the principal variation.
>>
>>How did you do it? Please be as specific as you can. Now that my
>>transposition table is working much better, I'd like to consider
>>re-using my past MTD(f) code.
>>
>>15% is nothing to sneeze at.
>>
>>--Stuart
>
>
>Hi Stuart,
>
>I have problems with PV's.  In Cilkchess 1, I implemented MTD(f) and
>Aske Plaat improved the driver (I was doing it wrong and still getting
>a small speedup over PVS.)  Aske Plaat implemented the PV reporting
>which basically walked the hash table and picked up the best moves.
>However I don't think this is a good strategy if you want accurate
>PV's.   Cilkchess 1 had trouble reporting PV's and often the 2nd move
>would be a silly blunder.  This was scary and made us think the search
>was broken but it solved every problem in our test sets in the expected
>depth.  We won Dutch championship without a single loss out of 11 games
>and pronounced the search SOUND.   We are now are confident it's a PV
>issue only.  But for the tournament we simply did a 2 ply search to
>get the opponents move and hoped the quality of it was high enough to
>be a good predictor.  We never got blunders but often predicted weaker
>moves or did not see simple tactics in the prediction.
>
>In Cilkchess 2.0 I implemented the PV again using the same technique.
>At the Dutch this year we kept predicting these horrible moves for our
>opponents and basically rarely got to think on the opponents time.  Our
>hash table was broken too and we operated with 1/16 of the hash table
>we could have used and this may also affect the PV thing.  We lost a
>game and could only manage 2nd place with all that hardware and our PV
>implementation is partly to blame (and the fact that the program
>basically
>sucks!)
>
>We tried to fix the PV problem in the middle of the tournament, mainly
>by paying attention to "beta."   We have no Alpha in our search and
>beta is equivalent to alpha (and beta) in other programs.  We did some
>hack to make sure we did not change the 2nd move in the hash table
>unless the score is properly bounded.  On my list of "things to do"
>is to experiment with this in general (not for PV reasons) to see if
>there are conditions under which you should or should not change the
>best move in the hash table.  The end result seemed (this is a
>subjective
>judgement) to be an improvement and in 1 game we actually predicted
>almost every single move of the opponent, having 45 minutes on our
>clock at move 60 while the opponent had to reset for speedchess (they
>allow this in this tournament.)   I thought the problem was fixed for
>good but in the next few games the bad behavior reared it's ugly head
>again but seemed much less severe.  All of my non MTD programs had
>sound PV's to the best of my knowledge.
>

when I tried this, I used (eventually) the idea of storing two bounds.
When
I stored a fail high bound, I stored the best move.  When I stored a
fail low
move, there obviously is no best move to store.  In this case I left the
fail-high move there.
This partially worked.  But it is *far* from perfect.  Particularly if
you
get the occasional lucky break of getting one fail high, followed by
several
fail lows.  It is possible to overwrite enough stuff that the PV simply
isn't
available.

Of all the problems I had, this was the worst.  1. I want good moves.
15%
faster is a totall loss if you can't predict good moves for the
opponent,
because the lost "ponder" time is *much* worse than 15%.

Lazy eval is another big problem... because I cut off evaluating if It
seems
I can't get close to alpha, or close to beta depending.  And when I back
this
partial eval back to the root, it won't give me a good estimate for the
next
mtd(f) re-search.  So I move the window up, search again, and the score
goes
even higher.  Until I finally reach a point where the lazy eval won't
exit
early.  But 3-4-5 re-searches and this is not as efficient as pure PVS.

That's where I left off.  My lazy eval gets much more than 15%.  I'd
have
to discard that or suffer thru extra re-searches...


>I am soon going to implement PV's the right way and tackle this problem.
>Essentially a PV is just information passed toward the top of the tree
>(the
>root node) in recursive fashion and is trivial to implement.   I don't
>know what problems I will find but I intend to get it right.  I will
>definitely post my results if you find it interesting.

I don't see how you can get a good PV from mtd(f).  The problem is that
every node fails either high or low.  The ones that fail low give you
zero
information on which move is best.  The hash table can't be depended on
to
provide this any more than it can in a normal PVS search, because of the
overwriting problem, not to mention the problem of storing fail-low
results
that are just as important (actually these are the *most* important when
you measure performance) but which don't have a best move.  If you are
lucky enough to overwrite an identical position that was a fail high or
PV
node, you can save the best move, but this didn't happen often for me...



>
>MTD(f) is hell bent on not doing any more work than necessary and causes
>problems with PV's but I don't think they are unsolvable.  I definitely
>want sound PV's.  I mentioned there were things about MTD I didn't like
>and this is only one of them.   I would love to throw it out completely
>and actually tried to, but could not even come within 15% of matching
>its  performance.  It's really quite powerful on the serial as well as
>the parallel program.   Unless you are a performance freak (which I am)
>don't use it!
>


Hmmm...  I have *always* been a performance freak.  :)  That's why I
have
worked on parallel search algorithms..., wrote 20,000 lines of Cray
assembly code, etc.  and then did the bitmap algorithms now used in
Crafty...  I *never* ignore potential speedups...  When I first heard
about
it I downloaded Aske's paper about it and went to work...

However, have you done a lazy eval?  If not you can get 15% there, but
it
will play hell with mtd(f) by not letting fail-soft give you a very
accurate
estimate.  Although in my case, when I turned it off, I still didn't get
a
very good estimate.  Closer than using either alpha or beta of course,
but
not close enough to make it pay off for me...


>
>- Don



This page took 0.05 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.