Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: MTD is a big disaster

Author: Vincent Diepeveen

Date: 05:52:18 07/20/99

Go up one level in this thread


On July 19, 1999 at 23:35:50, Don Dailey wrote:

>I have noticed a lot of posts lately on the subject of MTD and thought
>I would give my observations and experiences.  First of all, I would say
>that MTD is simply a big win.  A lot of people have reported various problems
>with it including myself.  But all the problems are correctable and you
>will be rewarded with the fastest possible aspriation search on the planet.
>All of the problems are based on not completely understanding what is going
>on and not bothering to stick with it until you figure it out.  Even regular
>alpha beta searching is full of gotcha's and a lot of people don't fully
>understand the proper way of doing aspiration searching.   This is forgivable,
>though, it's complicated and very easy to overlook some of the hairy issues.
>It's one of those things that seems ridiculously simple once you understand
>it, but until then is not so simple.

Note that i perfectly understand both algorithms. I don't mind you saying:
"Even regular alpha beta searching is full of gotcha's and a lot of
 people don't fully understand the proper way of doing aspiration searching."

If however that line is meant for me, then i'm insulted till my bones.

>The lazy evaluation problem is one I ran into with Cilkchess.  When I tried
>to use lazy evaluation I got big speedups in terms of nodes per second,
>but the number of nodes inflated to make it NOT a win.  This was quite
>annoying but was caused because the value you  return to the mtd driver
>was often "weaker"  because of the "cheat margin" you used with your
>evaluation.    The solution is not to use beta (the single goal value
>of an mtd probe) but to use the global lower/upper bound that the mtd
>driver itself keeps track of.  Apply your scoring margins to THOSE values
>because they are "true" bounds, not speculative bounds.  I learned about
>this from discussions with others at the world computer chess championship.
>It was one of those things that should have been obvious to me but wasn't.

In DIEP i don't have a lazy eval problem, as i don't use lazy eval.
I'm completely against it. Let's say i know too much about lazy eval
to let my hashtables be spoiled with them.

I get however a 60% speedup by using a simple hashtable that only
stores my eval. This goes down to 50% speedup when loading factor
becomes huge (way above 1).

As i know most programs are not using a eval hashtable, i think
i may point to my eval table for this reason. What they achieve
with lazy eval, i achieve more or less with an eval table.

>I would like to mention that I was forced to use MTD and solve these
>problems (also problems like bad PV's) because it was simply faster,
>and if the speedup was trivial I would have gladly just avoided the
>issue!
>- Don

Don, i don't doubt MTD is faster for your programs.

Let's however write down some facts why my prog is unhappy with MTD.
It's up to others to generalize it to their progs:

  - the huge number of researches needed. In DIEP my evaluation is nowadays
    in 1/1000 of a pawn. For a long time i had 1/200 of a pawn (in the time
    i experimented with MTD), but now i have 1/1000 of a pawn. So a drop
    of 0.20 pawn, which is a normal drop in DIEP, is in fact a drop of 200
    points. Happy researching!
  - My evaluation has a lot of knowledge, so i can expect every iteration
    a different behaviour of it, although the ups and downs in evaluation
    every ply i've figured out are mainly caused by the mobility component,
    this means nevertheless that i'm sure that i need a lot of researches.
  - As my DIEP needs very little nodes a ply (must see the first prog
    needing less when also turning off the forward pruning, except for
    nullmove), although i don't dare to say that my sorting is optimal,
    *everything* but near optimal i prefer to say, this means however
    that all overhead i have to search is too much overhead for it.

    Secondly it simply happens to be the case that
    when searching using a bound of 0.10 that all scores that get returned
    are very near to it (nullmove). This has the nasty habit that a lot
    of bounds stored
    in the transpositiontable will not work next research, and as we know we
    have a lot of researches to do, then it's easy to realize that this
    gives a lot of overhead which is simply unnecessary.
  - Chess is a worst case game. That means that if you have one big weak spot,
    that you are complete history. Even at nowadays hardware a weak spot of
    DIEP is its search speed. It's getting 15k nodes a second at a PII450.
    Now i can have a branching factor like heaven, but still it won't search
    very deep, unless i give it a lot of time. At very slow levels like 3
    minutes a move it usual will even search near the opponents depth, or
    sometimes even deeper, because of this branching factor, but up to the
    first 90 seconds or so, for sure a program that's 20 times slower than
    its opponents is not likely to search deeper.

    So it's important to realize that it already doesn't search deeply,
    when getting to my point:
    in very complicated positions where the score goes up and down a lot,
    or where a few fail lows occur, that usual are very important points in
    a game. Now we all give our programs more time, but still those positions
    it not seldom searches 2 or 3 ply less or so than in other middlegame
    positions. In those complicated positions a lot of games get decided.
    It is in those positions that MTD has a major weak spot.
    MTD can't stand a lot of researches. So there where one needs its search
    algorithm the most, there MTD has a major shortcoming. There MTD shows its
    worst case and simply abandons you, needing up to 6 times more nodes.

    Now you can do the math. That's a laughable DEPTH i get then. If i
    search a few moves at a depth of 9 ply, and all others at like 13
    or 14 ply, then i'm completely smoked.

    Even the tactical barrier where we talked about some years ago,
    and was estimated by me at 12 ply, is 3 plies off that.

    Doing 9 ply searches is back to the old days!
    Now 9 versus 13/14 ply is an example, but it clearly shows my point.

    MTD is simply a worst case/ best case algorithm. If scores are
    very close to each other (usually when there is only 1 obvious move
    to make), then it will not perform bad, but quickly get to the next ply.
    If MTD scores are not close, and a lot of different PVs follow each other
    up, then MTD has a major problem.

It's obvious that MTD is a dead wrong algorithm for computerchess!

If Don would say: "average it needs less nodes in Cilkchess", then considering
how Cilkchess has been made and how all progs of Don are, i directly
believe him on his word.

What will remain of those results when you get a better eval in the future?
How about a worst case analysis of MTD versus PVS?

Vincent Diepeveen
diep@xs4all.nl











This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.