Author: Bas Hamstra
Date: 09:11:32 01/06/01
Hi James!

On January 06, 2001 at 11:48:44, James Swafford wrote:

>On January 05, 2001 at 17:36:55, Bas Hamstra wrote:
>
>You're a little ahead of me. I've been wanting to do some work with
>TD for some time, but I've still got a couple months of work to do
>before I even get started. I do have a couple questions for you,
>though:
>
>1. Did you start with realistic weights, or did you begin with
>random values, or ???

I started with all weights zero. To see that it did something reasonable, I let it play a pool of 100 slightly randomized hand-tuned evals. You don't need 100 programs, just 100 weight sets and one program. It then learns to score 50% within 200 or 300 games. I plotted the weights in real time in a graph to see how they developed. This was my first test. The second step was to put it in my console-based "real" program, and it is playing Crafty right now.

>2. What do you mean by "wrong trend?" I suppose you mean a term
>is "drifting" the wrong way... becoming more negative when it should
>be going more positive?

Yep. Say 90% of the weights tend to show reasonable values, but a few don't at all. It might be that it needs more games, though. I am not sure this is the fastest way of automatic parameter tuning. Maybe some kind of "weight fitting" on a large set of positions is more efficient, but how?

>3. How are you training your evaluator? With a wide variety of
>opponents, or by playing the same programs over and over, or ???
>How many games have you played?

Right now I am playing Crafty for a couple of hundred 1 0 games.

>4. Does your engine compete on ICC?

A couple of times. But mostly FICS. The TD version has not played there yet.

By the way: I compared what it learns from a) wins b) losses c) draws. In my opinion a) and c) did not do well. So now I only learn from losses. Never change a winning team.

>James

Let me know your experiences, I am interested!

Ciao!

Bas.

>
>
>>I would like to share experience with some that have tried Temporal Difference
>>learning. Currently one of the problems I see is that for example BISHOPMOBILITY
>>has very large partial derivatives. So the updates for this term swing wildly
>>and distort learning. On the other hand, if I reduce the learning factor to
>>bring this in proportion, a term like DOUBLEDPAWN won't ever get to a realistic
>>value.
>>
>>One way to do better is to work with derivatives -1 or +1 only, depending on if
>>the partial derivative for a term is above or below zero. This results in a
>>tendency to realistic values for most terms.
>>
>>Still, a few terms refuse to show a "trend" at all, or even the wrong trend. Are
>>others having these problems?
>>
>>(To see what is going on I showed the developments of the weights in a graph, to
>>verify it does something useful, best results so far with -1/+1 only, bad
>>results when using the real derivatives)
>>
>>
>>Regards,
>>Bas.
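
The -1/+1 idea from the quoted post can be sketched roughly as follows. This is only an illustration, not the actual code of the program discussed: the function names, the learning rate, the TD error and the two derivative values are made-up assumptions, and it assumes the TD error keeps its magnitude while only the partial derivative is replaced by its sign.

```python
def sign(x):
    """Return -1, 0 or +1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def raw_step(weights, grads, td_error, lr):
    """Plain gradient-style update: step size scales with each partial derivative."""
    return [w + lr * td_error * g for w, g in zip(weights, grads)]


def sign_step(weights, grads, td_error, lr):
    """Sign-only update: each derivative is replaced by -1 or +1 (0 if exactly zero)."""
    return [w + lr * td_error * sign(g) for w, g in zip(weights, grads)]


if __name__ == "__main__":
    # Hypothetical numbers: a term with a large derivative (think BISHOPMOBILITY)
    # next to a term with a small one (think DOUBLEDPAWN).
    weights = [0.0, 0.0]
    grads = [40.0, -0.5]
    td_error = 0.2   # assumed temporal-difference error for one position
    lr = 0.01        # assumed learning factor

    print("raw update :", raw_step(weights, grads, td_error, lr))
    print("sign update:", sign_step(weights, grads, td_error, lr))
```

With the raw derivatives the first term moves 80 times further than the second in a single step, while with signs only both terms move by the same amount per update. The price is that the size of the derivative no longer carries any information, which may be one reason a few terms show no trend.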
Last modified: Thu, 15 Apr 21 08:11:13 -0700