Author: Bas Hamstra
Date: 18:11:04 01/06/01
On January 06, 2001 at 15:08:11, Rémi Coulom wrote:

>On January 05, 2001 at 17:36:55, Bas Hamstra wrote:
>
>>I would like to share experiences with others who have tried Temporal Difference
>>learning. Currently one of the problems I see is that, for example, BISHOPMOBILITY
>>has very large partial derivatives. So the updates for this term swing wildly
>>and distort learning. On the other hand, if I reduce the learning factor to
>>bring this in proportion, a term like DOUBLEDPAWN won't ever get to a realistic
>>value.
>>
>>One way to do better is to work with derivatives of -1 or +1 only, depending on
>>whether the partial derivative for a term is above or below zero. This results in a
>>tendency to realistic values for most terms.
>>
>>Still, a few terms refuse to show a "trend" at all, or even show the wrong trend. Are
>>others having these problems?
>>
>>(To see what is going on I plotted the development of the weights in a graph, to
>>verify it does something useful. Best results so far with -1/+1 only, bad
>>results when using the real derivatives.)
>>
>>
>>Regards,
>>Bas.
>
>I have no experience in using TD(lambda) for chess, but I know a little about
>reinforcement learning and neural networks, and what you describe looks like a
>typical ill-conditioning problem. (I am currently using TD(lambda) to solve
>control problems, but it works rather similarly.) You can take a look at:
>
>ftp://ftp.sas.com/pub/neural/illcond/illcond.html
>ftp://ftp.sas.com/pub/neural/FAQ2.html#A_illcond
>
>The lazy solution to this problem consists in tweaking coefficients for weights
>in order to improve the condition number. The harder way is to use more advanced
>learning algorithms than vanilla gradient descent (conjugate gradient, for
>instance), as explained in the links above.
>
>A good reference for the theory of this kind of algorithm is
>http://www.athenasc.com/ndpbook.html
>
>I hope this helps.
>
>Remi

Thanks Remi. Not easy though, if you're not too familiar with the terminology.

Bas.
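
For readers less familiar with the terminology, the effect discussed above can be reproduced on a toy problem. The sketch below is not TD(lambda) and not anyone's engine code: it is a plain least-squares fit of a two-term linear evaluation, with made-up feature ranges (a mobility-like term in 0..13, a doubled-pawn-like flag in 0..1) and made-up target weights. All function names are hypothetical. It compares three update rules: the real derivatives with one learning rate, the -1/+1 rule described in the post, and the "lazy solution" of rescaling each term's derivative, here assumed to mean dividing by that feature's mean square.

```python
import random

def make_data(n=200, seed=0):
    """Synthetic positions: x[0] is a wide-range term, x[1] a 0/1 flag."""
    rng = random.Random(seed)
    true_w = [4.0, -20.0]  # assumed target weights, for illustration only
    data = []
    for _ in range(n):
        x = [rng.randint(0, 13), rng.randint(0, 1)]
        y = sum(w * xi for w, xi in zip(true_w, x))
        data.append((x, y))
    return data, true_w

def gradient(w, data):
    """Gradient of half the mean squared error with respect to each weight."""
    g = [0.0] * len(w)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            g[i] += err * xi / len(data)
    return g

def mean_squares(data):
    """Mean of x[i]**2 per feature, used as a crude per-term scale."""
    n = len(data)
    return [sum(x[i] ** 2 for x, _ in data) / n for i in range(2)]

def raw_update(lr):
    """Vanilla gradient descent: one learning rate for every term."""
    return lambda w, g: [wi - lr * gi for wi, gi in zip(w, g)]

def sign_update(step):
    """Use only the sign (-1/+1) of each partial derivative."""
    def sgn(v):
        return (v > 0) - (v < 0)
    return lambda w, g: [wi - step * sgn(gi) for wi, gi in zip(w, g)]

def scaled_update(lr, scales):
    """Divide each partial derivative by that feature's mean square."""
    return lambda w, g: [wi - lr * gi / s for wi, gi, s in zip(w, g, scales)]

def train(data, steps, update):
    """Run `steps` full-batch updates starting from zero weights."""
    w = [0.0, 0.0]
    for _ in range(steps):
        w = update(w, gradient(w, data))
    return w

if __name__ == "__main__":
    data, true_w = make_data()
    print("target            ", true_w)
    print("raw, lr=0.01      ", train(data, 300, raw_update(0.01)))
    print("raw, lr=0.1 (50)  ", train(data, 50, raw_update(0.1)))
    print("sign, step=0.1    ", train(data, 300, sign_update(0.1)))
    print("scaled, lr=0.5    ", train(data, 300,
                                      scaled_update(0.5, mean_squares(data))))
```

In this toy setup, a learning rate small enough to keep the wide-range term stable leaves the 0/1 term far from its target after the same number of steps, while a larger rate makes the wide-range term blow up. The sign rule moves both weights at the same pace but can only settle to within a few step sizes of the target, and it may briefly push a weight the wrong way when the two terms are correlated. Whether any of this carries over to a real evaluation function trained with TD is not something the sketch can show.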
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.