Author: Rémi Coulom
Date: 23:37:09 10/20/01
On October 21, 2001 at 00:11:23, James Swafford wrote:

>be 'vectorized', or placed in a vector. OK. Additionally,
>eval(pos,w) must be a differentiable function of its parameters
>w (w1...wk).
>
>Exactly what does that mean? I've had some calculus, and I've
>had some linear algebra, but that eludes me.

You do not have to worry about that. Strictly speaking, most evaluation functions are not differentiable, since they use integers or thresholds, but all sensible evaluation functions can be considered "differentiable" in the sense that this algorithm works for them.

>Next - at the end of the game, TDLeaf will update the parameter
>vector. It does this, if I understand the paper correctly,
>by the following:
>
>    w = w + lr * s * t,
>
>where lr is a scalar for learning rate, s is the sum of
>the vectors of partial derivatives of the evaluation at each
>position with respect to its parameters (w1...wk), and I don't
>want to get into t right now for fear of complicating my question.
>
>How do I compute a vector of partial derivatives of the eval
>at any position with respect to its parameter weights?

If your evaluation function is, say, f(w_1, w_2, ..., w_n), then the partial derivative of f with respect to weight w_i is the limit of

    (f(w_1, ..., w_i + epsilon, ..., w_n) - f(w_1, ..., w_i, ..., w_n)) / epsilon

as epsilon goes to zero. In practice, you can estimate this value by computing the ratio above with a very small value of epsilon.

>Forgive me if I'm being horribly unclear or I've botched the
>algorithm; I'm trying to make sense of the thing.
>
>--
>James
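As a sketch (not from the post itself), here is how the finite-difference estimate and the w = w + lr * s * t update might look in Python. The evaluation function, feature values, and the values of lr and t are all made up for illustration, and the gradient here is for a single position (the paper sums these gradients over the positions of a game):

```python
def evaluate(features, w):
    # Toy linear evaluation: dot product of feature values
    # (e.g. material counts) with their weights.
    return sum(f * wi for f, wi in zip(features, w))

def numerical_gradient(f, w, epsilon=1e-6):
    # Estimate the partial derivative of f with respect to each
    # weight w_i as (f(..., w_i + epsilon, ...) - f(...)) / epsilon.
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_plus[i] += epsilon
        grad.append((f(w_plus) - f(w)) / epsilon)
    return grad

features = [3.0, 1.0, 2.0]   # hypothetical feature values of one position
w = [1.0, 5.0, 3.0]          # current weight vector

grad = numerical_gradient(lambda wv: evaluate(features, wv), w)
# For a linear evaluation, the partial derivative with respect to w_i
# is just the i-th feature value, so grad is approximately [3, 1, 2].

lr = 0.01                    # learning rate (made up)
t = 0.5                      # temporal-difference term (see the paper)
w = [wi + lr * gi * t for wi, gi in zip(w, grad)]
```

For a genuinely discontinuous evaluation (integer scores, thresholds), the ratio can be zero or wildly large for a given epsilon, which is why the "differentiable in the sense that the algorithm works" caveat above matters in practice.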