Computer Chess Club Archives


Subject: Re: Temporal Difference

Author: Jay Scott

Date: 10:58:19 01/07/01


On January 06, 2001 at 12:11:32, Bas Hamstra wrote:

>On January 06, 2001 at 11:48:44, James Swafford wrote:

>>1.  Did you start with realistic weights, or did you begin with
>>random values, or ???
>
>I started with all weights zero. To see it did something reasonable, I let it
>play a pool of 100 slightly randomized hand-tuned evals. You don't need 100
>programs, just 100 weight sets and one program. It then learns to score 50% in
>200 or 300 games. I plotted the weights real time in a graph to see how they
>developed. This was my first test. Second step was to put it in my console based
>"real" program and it is playing Crafty right now.

You already know that you'll get faster learning if you start with weights
near "reasonable" values.

Here's a point that many miss: You can choose one weight and *fix* it.
For example, set the pawn score to 100 and don't let it change; leave it
out of the learning process. That's because game play only depends on the
relative values of the weights; by fixing one weight you set the scale
for the others, and prevent them from drifting around aimlessly, while
losing nothing. You gain learning speed and accuracy.
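
A minimal sketch of the fixed-weight idea, assuming a linear evaluator whose weights live in a dict keyed by feature name. The names `update_weights`, `best_move`, `fixed` and `lr` are made up for illustration and do not come from any particular program. The update step skips the fixed weight, and `best_move` shows that scaling every weight by the same positive factor does not change which move is chosen.

```python
def update_weights(weights, gradient, lr, fixed=("pawn",)):
    """Apply one gradient step, leaving every weight named in `fixed` untouched."""
    return {
        name: w if name in fixed else w + lr * gradient.get(name, 0.0)
        for name, w in weights.items()
    }


def best_move(weights, candidates):
    """Return the move whose feature counts score highest under a linear eval.

    `candidates` maps a move name to a dict of feature counts (hypothetical format).
    """
    def score(features):
        return sum(weights[name] * value for name, value in features.items())

    return max(candidates, key=lambda move: score(candidates[move]))


if __name__ == "__main__":
    weights = {"pawn": 100.0, "knight": 300.0}
    candidates = {
        "a": {"pawn": 1, "knight": 0},   # wins a pawn
        "b": {"pawn": -1, "knight": 1},  # gives a pawn, wins a knight
    }
    scaled = {name: 2.5 * w for name, w in weights.items()}
    # Same choice under both weight sets: only the ratios matter.
    print(best_move(weights, candidates), best_move(scaled, candidates))
    # The pawn weight stays at 100 no matter what the gradient says.
    print(update_weights(weights, {"pawn": 5.0, "knight": -2.0}, lr=10.0))
```

Keeping the fixed weight out of the update, rather than rescaling all weights after each step, is only one way to do it; it has the advantage of needing no extra normalization pass.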

If your evaluator is measuring probability-to-win then in effect
it already uses this trick: the output has to be a probability, so the
scale of the weights is already determined.

>By
>the way: I compared what it learns from a) wins b) losses c) draws. In my
>opinion a) and c) did not do well. So now I only learn from losses. Never
>change a winning team.

Let me suggest that you can do even better by deciding whether to learn
for each move rather than for each game.

* If the opponent played the predicted move, and the score changed, then
there is something to learn.

* If the opponent did not play the predicted move, and the score changed
in favor of the opponent, then again there is something to learn.

* If the opponent did not play the predicted move, and the score changed
against the opponent, then we can assume that the opponent blundered.
We don't want to learn from blunders.
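
The three cases above can be sketched as a small filter. This is only an illustration: the function name `should_learn`, its parameters and the `tolerance` threshold are assumptions, and scores are taken from the learning program's point of view, so a drop in score is a change in favor of the opponent. The case where the opponent plays an unpredicted move and the score does not change is not covered above; the sketch treats it as nothing to learn.

```python
def should_learn(played_predicted, score_before, score_after, tolerance=0.0):
    """Decide whether to apply a learning update after the opponent's move.

    played_predicted: True if the opponent played the move our search predicted.
    score_before, score_after: our evaluation before and after the opponent's
        move, from our own point of view (higher is better for us).
    tolerance: hypothetical threshold below which a change counts as "no change".
    """
    delta = score_after - score_before
    if abs(delta) <= tolerance:
        return False  # score did not change: nothing to learn (assumed)
    if played_predicted:
        return True   # predicted move, yet the score changed: learn
    # Unpredicted move: learn only if the score moved in the opponent's favor.
    # If it moved against the opponent, assume a blunder and skip it.
    return delta < 0


if __name__ == "__main__":
    print(should_learn(True, 10, 30))    # predicted move, score changed
    print(should_learn(False, 10, -20))  # surprise move that hurt us
    print(should_learn(False, 10, 50))   # surprise move that helped us
```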

  Jay



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.