Author: James Swafford
Date: 21:11:23 10/20/01
I'm working back through some papers by Baxter, Tridgell and Weaver.
The one I'm looking at now is entitled "TDLeaf(λ): Combining
Temporal Difference Learning with Game-Tree Search."
I get the basic idea, but I have some specific questions I'm
hoping some of you can help me with.
The big goal, as I understand it, is to minimize the error between
the evaluation's predicted outcome and the real outcome by using
the differences in the predicted outcome from move to move:
td(t) = eval(pos(t+1),w) - eval(pos(t),w), where w is a
vector of evaluation parameter values.
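To check that I have the definition right, here's how I'd compute
the sequence of temporal differences for one game. This is a rough
sketch in Python; the names are mine, not the paper's.

    # 'evals' holds eval(pos(t), w) for each position t in the game,
    # in order. td(t) = eval(pos(t+1), w) - eval(pos(t), w).
    def temporal_differences(evals):
        return [evals[t + 1] - evals[t] for t in range(len(evals) - 1)]

    # Example: the predicted outcome drifts from 0.5 up to 0.8.
    print(temporal_differences([0.5, 0.6, 0.55, 0.8]))
    # -> roughly [0.1, -0.05, 0.25]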
First, it's obvious that the evaluation parameters must somehow
be 'vectorized', or placed in a vector. OK. Additionally,
eval(pos,w) must be a differentiable function of its parameters
w (w1...wk).
Exactly what does that mean? I've had some calculus, and I've
had some linear algebra, but that eludes me.
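The closest I can get to a concrete picture: if the eval were a
plain weighted sum of feature values (my assumption for the sake of
illustration, not necessarily the paper's), then each partial
derivative would just be a feature value.

    # Sketch, assuming a linear eval:
    #   eval(pos, w) = w[0]*f0(pos) + ... + w[k-1]*fk-1(pos)
    # where the f's are feature values (material, mobility, ...).
    def linear_eval(features, w):
        return sum(wi * fi for wi, fi in zip(w, features))

    # For this form, d(eval)/d(w[i]) = features[i], so the gradient
    # with respect to w is just the position's feature vector.
    def gradient(features, w):
        return list(features)

Is that the sort of differentiability the paper demands, or is
there more to it?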
Next - at the end of the game, TDLeaf will update the parameter
vector. It does this, if I understand the paper correctly,
by the following:
w = w + lr * s * t,
where lr is a scalar for learning rate, s is the sum of
the vectors of partial derivatives of the evaluation at each
position with respect to its parameters (w1...wk), and I don't
want to get into t right now for fear of complicating my question.
How do I compute a vector of partial derivatives of the eval
at any position with respect to its parameter weights?
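To make my question concrete, here's the update as I currently read
it, again assuming the linear eval above (so the gradient at each
position is just its feature vector):

    # Sketch of the end-of-game update as I read it. 'lr' is the
    # learning rate; 't' is the term I'm deferring, so I treat it
    # here as an already-computed scalar.
    def update(w, game_positions_features, lr, t):
        k = len(w)
        s = [0.0] * k
        for features in game_positions_features:
            for i in range(k):
                s[i] += features[i]  # gradient component = feature value
        return [w[i] + lr * s[i] * t for i in range(k)]

If that's roughly right, then my question reduces to: what plays
the role of features[i] when the eval isn't a simple weighted sum?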
Forgive me if I'm being horribly unclear or I've botched the
algorithm; I'm trying to make sense of the thing.
--
James