Computer Chess Club Archives


Subject: Re: Temporal Difference

Author: Bas Hamstra

Date: 14:17:06 01/06/01


On January 06, 2001 at 14:34:09, James Swafford wrote:

>On January 06, 2001 at 12:11:32, Bas Hamstra wrote:
>
>>Hi James!
>>
>>On January 06, 2001 at 11:48:44, James Swafford wrote:
>>
>>
>>I started with all weights at zero. To check that it did something reasonable,
>>I let it play a pool of 100 slightly randomized hand-tuned evals. You don't need
>>100 programs, just 100 weight sets and one program. It then learns to score 50%
>>within 200 or 300 games. I plotted the weights in real time in a graph to see
>>how they developed. This was my first test. The second step was to put it in my
>>console-based "real" program, and it is playing Crafty right now.
>
>Seems to me it would help things along if you started with reasonable
>values.  I think Tridgell / Baxter started with all weights = 0, too,
>with KnightCap, and remarked the same thing.
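
The pool idea quoted above (one program, 100 weight sets, learner starting from zero) can be sketched roughly as follows. This is only an illustration: the feature names, the centipawn values, and the 10% noise level are assumptions, not numbers from the post.

```python
import random

def make_pool(base_weights, n_sets=100, noise=0.1, seed=0):
    """Return n_sets copies of base_weights, each term scaled by a random
    factor in [1 - noise, 1 + noise]. The noise level is an assumption."""
    rng = random.Random(seed)
    return [
        {name: value * (1.0 + rng.uniform(-noise, noise))
         for name, value in base_weights.items()}
        for _ in range(n_sets)
    ]

# Hypothetical hand-tuned values in centipawns (not taken from the post).
HAND_TUNED = {"knight": 300.0, "bishop": 320.0, "rook": 500.0,
              "queen": 900.0, "passed_pawn": 20.0}

pool = make_pool(HAND_TUNED)                    # 100 opponent weight sets
learner = {name: 0.0 for name in HAND_TUNED}    # learner starts at all zeros
opponent = random.Random(1).choice(pool)        # one opponent drawn per game
```

The same engine binary would then load `opponent` for one side and `learner` for the other, so no second program is needed.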

>>>2.  What do you mean by "wrong trend?"  I suppose you mean a term
>>>is "drifting" the wrong way... becoming more negative when it should
>>>be going more positive?
>>
>>Yep. Say 90% of the weights tend to show reasonable values. But a few don't at
>>all. It might be that it needs more games, though. I am not sure if this is the
>>fastest way of automatic parameter tuning. Maybe some kind of "weight fitting"
>>on a large set of positions is more efficient, but how?
>
>What would happen if you "moved" those weights to where you think they belong?

That would be cheating :) I want to see it work straight away and then see the
score percentage go up.

>>>3.  How are you training your evaluator?  With a wide variety of
>>>opponents, or by playing the same programs over and over, or ???
>>>How many games have you played?
>>
>>Right now I am playing Crafty for a couple of hundred 1 0 games.
>
>Hmmm.... so you're training your evaluator to play the best it can
>against Crafty at 1 0.  Why not ICC (or FICS) against a wide variety
>of opponents?  I'd prefer slightly slower time controls, too, although
>I know with that many games time gets to be a problem...

Yes, I want to have a reliable method first. I do not feel I have reached that
point. Once the method works for 1 0 it will work for longer tc's too. The nice
thing is you can MEASURE if it works. If it works, the score must go up.
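
One way to do that measuring is to track the score percentage over consecutive blocks of games, with a rough standard error so noise is not mistaken for progress. A minimal sketch, assuming results are recorded from the learner's side as 1.0 (win), 0.5 (draw), 0.0 (loss); the function names and the window size of 50 are illustrative choices:

```python
import math

def score_percentage(results):
    """Score in percent for a list of results (1.0 win, 0.5 draw, 0.0 loss)."""
    return 100.0 * sum(results) / len(results)

def standard_error(results):
    """Standard error of the score percentage (needs at least two games)."""
    n = len(results)
    mean = sum(results) / n
    variance = sum((r - mean) ** 2 for r in results) / (n - 1)
    return 100.0 * math.sqrt(variance / n)

def windowed_scores(results, window=50):
    """Score percentage for each consecutive, non-overlapping block of games."""
    return [score_percentage(results[i:i + window])
            for i in range(0, len(results) - window + 1, window)]
```

With only a couple of hundred games the standard error is several percentage points, so a rising trend across blocks is more convincing than any single block.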

>>>4.  Does your engine compete on ICC?
>>
>>A couple of times. But mostly FICS. The TD version has not played there yet. By
>>the way: I compared what it learns from a) wins b) losses c) draws. In my
>>opinion a) and c) did not do well. So now I only learn from losses. Never change
>>a winning team.
>
>Yes, that makes sense.  Thanks for the info!  Maybe we can share notes
>in a few months...

One additional advantage is that when learning from losses only, you don't have
to deal with blunders that distort learning. The computer makes no (tactical)
blunders. And a blundering opponent will not win against a computer. This (I
suspect) increases the probability that IF it learns something, it will be
something useful.

Ciao!
Bas.















Last modified: Thu, 15 Apr 21 08:11:13 -0700
