Computer Chess Club Archives



Subject: Re: Questions about Tridgell/Baxter's paper on TDLeaf

Author: Rémi Coulom

Date: 23:37:09 10/20/01



On October 21, 2001 at 00:11:23, James Swafford wrote:

>be 'vectorized', or placed in a vector.  OK.  Additionally,
>eval(pos,w) must be a differentiable function of its parameters
>w (w1...wk).
>
>Exactly what does that mean?  I've had some calculus, and I've
>had some linear algebra, but that eludes me.

You do not have to worry about that. Strictly speaking, most evaluation functions
are not differentiable, since they use integers or thresholds; but any sensible
evaluation function is close enough to "differentiable" that this algorithm works
for it.

>
>Next - at the end of the game, TDLeaf will update the parameter
>vector.  It does this, if I understand the paper correctly,
>by the following:
>
>   w = w + lr * s * t,
>
>where lr is a scalar for learning rate,  s is the sum of
>the vectors of partial derivatives of the evaluation at each
>position with respect to its parameters (w1....wk), and I don't
>want to get into t right now for fear of complicating my question.
>
>How do I compute a vector of partial derivatives of the eval
>at any position with respect to its parameter weights?

If your evaluation function is, say, f(w_1, w_2, ..., w_n), then the partial
derivative of f with respect to weight w_i is the limit of
(f(w_1, ..., w_i + epsilon, ..., w_n) - f(w_1, ..., w_i, ..., w_n)) / epsilon
as epsilon goes to zero. In practice, you can estimate this value by evaluating
the ratio above with a very small value of epsilon.
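To make that concrete, here is a small sketch of the finite-difference estimate
described above. The evaluation function and weights are hypothetical stand-ins
for a real engine's eval(pos, w) and its tunable parameters, not anything from
the paper:

```python
def numerical_gradient(eval_fn, w, epsilon=1e-6):
    """Estimate the partial derivative of eval_fn with respect to each
    weight w_i by a forward difference: (f(w + eps*e_i) - f(w)) / eps."""
    base = eval_fn(w)
    grad = []
    for i in range(len(w)):
        w_step = list(w)
        w_step[i] += epsilon       # nudge only the i-th weight
        grad.append((eval_fn(w_step) - base) / epsilon)
    return grad

# Toy example: a linear "evaluation" 2*w0 + 3*w1 has true gradient [2, 3].
g = numerical_gradient(lambda w: 2*w[0] + 3*w[1], [1.0, 1.0])
```

For a linear function the forward difference is essentially exact; for a real
evaluation function the estimate is only as good as the function is smooth
around w, which is why thresholds and integer scores make it noisy.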

>
>Forgive me if I'm being horribly unclear or I've botched the
>algorithm; I'm trying to make sense of the thing.
>
>--
>James




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.