Computer Chess Club Archives


Search

Terms

Messages

Subject: calculating the PV using MTD

Author: Fabien Letouzey

Date: 01:08:52 09/11/02



Hello,

Recently Uri Blass (I think it was him) and Martin Fierz asked about how
to get a PV from MTD (not using the hash table).  I do not use MTD, but I
thought a bit about how to get a PV in general (using a "multiple searches"
algorithm).  Until now I had not considered that this could be interesting
for anyone (plus the idea is completely obvious), but since Uri and Martin
were apparently not answered, I share it with no garantee of correctness.

---

Assumptions.

- You want to get the PV.
- You do not want get the PV from hash table.
- You know how to get the PV from plain alpha-beta
  (otherwise use the CCC search engine).
- You use a "multiple searches" algorithm
  (aspiration alpha-beta, PVS, MTD, ...).
- You do not widen the windows to ensure a correct PV
  (using [value-1,beta] instead of [value,beta] in case of a PVS re-search
  is widening).

---

Disclaimer (silly attempt to avoid yet another time-wasting flame thread).

- I do not claim that calculating the PV is better than getting it from
  the hash table.
- I do not claim that not widening the windows for PVS and/or adjusting
  the alpha-beta window from the hash table is a clever thing to do.
- I do not claim that my idea is original or even interesting;
  like I said, some programmers asked for this.
- I do not claim that this method is fast; it is even slower than the
  traditional way of updating the PV when (value > alpha && value < beta).

More importantly,

- *I do not garantee that the PV is always complete*.  It is not possible
  to garantee a full PV in the general case unless you widen the windows
  a bit (which is obviously impossible for MTD).  Never forget that those
  "better" algorithms like PVS or MTD (as compared with plain alpha-beta)
  actually compute fewer information.  The value is correct but you lose
  a bit in the PV and the hash table.

---

I use what I call here a semi-PV.  The idea is so obvious that it must
have had a name for decades, and I would be glad to learn about it.

Small plan:

1) What is a semi-PV and how to calculate it?
2) How to compose semi-PVs to obtain a PV?

---

1) What is a semi-PV and how to calculate it?

I define a semi-PV as the sequence of moves which leads to the search
value, even if it is a bound.  So getting the semi-PV is very easy,
just update it at the same place where you update the node value.

e.g. with a negamax fail-soft alpha-beta (pseudo code):

int alpha_beta(int alpha, int beta, int depth) {

   if (depth == 0) return quiescence(alpha,beta);

   best_value = -infinity; /* value outside the search and eval ranges */

   for (move in moves) {

      make_move(move);
      value = -alpha_beta(-beta,-alpha,depth-1);
      unmake_move(move);

      if (value > best_value) { /* always true for the first move */
         best_value = value;
         update_pv(move); /* yes here, not when value > alpha */
         if (value > alpha) {
            alpha = value;
            if (value >= beta) break;
         }
      }
   }

   return best_value;
}

Comments:

I repeat, the only important thing is that the semi-PV always contains the
move which leads to the returned value (here best_value).  Since the
semi-PV is updated much more often than a traditional PV (most of the work
beeing useless, but you cannot know beforehand), its computation is slower.

---

2) How to compose semi-PVs to obtain a PV?

A PV is defined when you have an exact value at a given node.  I assume
here that you get this exact value by a fail-high and a fail-low.  If you
get the value in a single search, there is no problem with getting the PV.

e.g. you have two searches:

window  value meaning    semi-PV
------- ----- ------- -------------
[+3,+4]   +3   <= +3  e4 e5 Bc4 Nf6
[+2,+3]   +3   >= +3  e4 e5 Nf3 Nc6

You need to keep the semi-PVs from both searches to calculate the node
PV.  If you used more searches to get the result, only those two are
relevant.

The lazy way to obtain a PV from this is to copy the semi-PV moves as long
as they do agree, so "e4 e5" would be the PV here.  Actually you can do
better.  After the last "agreed" move one of the semi-PVs is correct for
an extra ply.
For a fail-high search, all odd-numbered moves (starting at 1) are "good"
moves (they all fail high).  Informally, a fail-high search is a proof that
you (the player who make the first move) get at least this result, so your
moves are good enough.  For a fail-low search, all even-numbered moves are
"good" moves.  A fail-low search is a proof that your opponent gets at least
this result, so his moves are good enough.
So in my example Nf3 is correct, and the PV would be "e4 e5 Nf3". It is
truncated here because there is no proof that Nc6 is optimal.  Note that
some programs do not truncate here and can display incorrect PVs instead.

---

Sorry for the awkward style, it is my first posting here.  I tried to be
precise to avoid misunderstanding.

Hope it helps,

Fabien Letouzey.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.