Computer Chess Club Archives

Subject: Re: Temporal Difference

Author: Bas Hamstra

Date: 18:11:04 01/06/01

On January 06, 2001 at 15:08:11, Rémi Coulom wrote:

>On January 05, 2001 at 17:36:55, Bas Hamstra wrote:
>
>>I would like to compare experiences with others who have tried Temporal Difference
>>learning. One problem I currently see is that a term such as BISHOPMOBILITY
>>has very large partial derivatives, so the updates for this term swing wildly
>>and distort learning. On the other hand, if I reduce the learning factor to
>>bring this into proportion, a term like DOUBLEDPAWN will never reach a realistic
>>value.
>>
>>One way to do better is to work with derivatives of -1 or +1 only, depending on
>>whether the partial derivative for a term is below or above zero. This results in
>>a tendency toward realistic values for most terms.
>>
>>Still, a few terms refuse to show a "trend" at all, or even show the wrong trend.
>>Is anyone else having this problem?
>>
>>(To see what is going on, I plotted the development of the weights in a graph to
>>verify that the learning does something useful. The best results so far are with
>>-1/+1 only; the results are bad when using the real derivatives.)
>>
>>
>>Regards,
>>Bas.
>
>I have no experience in using TD(lambda) for chess, but I know a little about
>reinforcement learning and neural networks, and what you describe looks like a
>typical ill-conditioning problem. (I am currently using TD(lambda) to solve
>control problems, but it works rather similarly.) You can take a look at:
>
>ftp://ftp.sas.com/pub/neural/illcond/illcond.html
>ftp://ftp.sas.com/pub/neural/FAQ2.html#A_illcond
>
>The lazy solution to this problem consists of tweaking scaling coefficients for
>the weights in order to improve the condition number. The harder way is to use
>more advanced learning algorithms than vanilla gradient descent (conjugate
>gradient, for instance), as explained in the links above.
>
>A good reference for the theory of this kind of algorithm is
>http://www.athenasc.com/ndpbook.html
>
>I hope this helps.
>
>Remi

Thanks Remi. Not easy though, if you're not too familiar with the terminology.

Bas.
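
For readers who are likewise unfamiliar with the terminology, the three update rules discussed in this thread can be compared on a single step. The following is only a minimal sketch, not code from either poster: the function names, the gradient values, the error, the learning factor and the scale factors are all hypothetical, and a plain "error times partial derivative" step stands in for the full TD(lambda) update. It assumes that the sign-only rule simply replaces each partial derivative by -1, 0 or +1, and that the "lazy" fix amounts to dividing feature i by a constant scales[i].

```python
def sign(x):
    """Return -1.0, 0.0 or +1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def raw_step(grads, error, lr):
    """Plain gradient step: weight i moves by lr * error * d(eval)/d(w_i)."""
    return [lr * error * g for g in grads]


def sign_step(grads, error, lr):
    """Sign-only step: each partial derivative is replaced by its sign."""
    return [lr * error * sign(g) for g in grads]


def scaled_step(grads, error, lr, scales):
    """Step obtained after dividing feature i by scales[i].

    Rescaling a feature by 1/s multiplies its weight by s and divides its
    partial derivative by s, so in terms of the ORIGINAL weight the step
    shrinks by a factor of s * s.
    """
    return [lr * error * g / (s * s) for g, s in zip(grads, scales)]


# Hypothetical numbers, chosen only to mimic the situation in the thread:
# one term with a large partial derivative, one with a small one.
grads = [12.0, -1.0]    # [BISHOPMOBILITY, DOUBLEDPAWN]
error = 0.5             # stand-in for the temporal difference
lr = 0.1                # learning factor
scales = [12.0, 1.0]    # assumed typical magnitude of each feature

print("raw:   ", raw_step(grads, error, lr))
print("sign:  ", sign_step(grads, error, lr))
print("scaled:", scaled_step(grads, error, lr, scales))
```

With these numbers the raw step for BISHOPMOBILITY is twelve times larger than the one for DOUBLEDPAWN, which is the imbalance described above. The sign-only rule gives both terms steps of equal magnitude but discards the size of the derivative. The scaled rule keeps that information and, measured in the rescaled weights (original weight times scales[i]), also produces steps of equal magnitude here.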

Re: Temporal Difference Rémi Coulom 02:23:53 01/07/01

