Computer Chess Club Archives


Subject: TDLeaf evaluation learning in EXchess

Author: Dan Homan

Date: 16:28:19 07/01/00


I keep meaning to retire EXchess and write an engine with some
more advanced techniques like bitboards or 0x88, but I always
end up coming back to EXchess to try "one more thing".

The "one more thing" that I have been meaning to try for a while
is Temporal Difference evaluation learning (e.g. KnightCap) because
it seemed simple to implement and I hate manually tuning my evaluation
... so much so that I haven't spent more than a few hours on tuning
the parameters.

Now I can report my first impressions with TD learning.  I sat down
this week and spent a few hours on a couple of different nights putting
it into my program and debugging (which took most of the time).  I can say
that the TD learning code (at least what I've done so far) is not much more
difficult to implement than tablebase probing.  The KnightCap guys really
lay it all out very well in their published papers.

My code is still messy and there are some issues I still need to work
on.  With that said, I did try it out on the piece values in my program.
I started the values at

PAWN = 100  (fixed)
KING = 10000  (meaningless, so learning should not affect it)
KNIGHT = 0
BISHOP = 0
ROOK = 0
QUEEN = 0

and played a 100-game lightning match (time control 1 0) against GNU Chess.
After the match, the parameters were at

PAWN = 100
KING = 10000
KNIGHT = 296
BISHOP = 327
ROOK = 508
QUEEN = 997

These are pretty close to my "hand-tuned" values.  I did interfere with
the match after game 25 to reduce the learning rate from a mean of 100
points per update to a mean of 10 points per update.  (The actual update
to the piece scores depends on the details of the given game and could
be a few times the mean or a modest fraction of the mean).  My decision
of when to reduce the learning rate probably influenced the final values,
but I am not certain by how much... after 25 games the values were within
100 points of the ones above but with lots of noise from game to game.
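
For readers who want to see the shape of such an update, here is a minimal
sketch of a TDLeaf(lambda)-style step applied to piece values only.  It is
not the EXchess code, which the post does not show.  The names evaluate and
td_leaf_update, the tanh squashing constant BETA, and the alpha and lam
parameters are all assumptions made for illustration.  The input is assumed
to be one material-difference record per learner move, taken at the leaf of
the principal variation.

```python
import math

# Pieces whose values are learned; PAWN (and KING) stay fixed as in the post.
PIECES = ["KNIGHT", "BISHOP", "ROOK", "QUEEN"]
PAWN = 100.0
BETA = 0.0025  # hypothetical scale that maps centipawns into tanh's range


def evaluate(counts, weights):
    """Linear material score in centipawns from the learner's side.

    counts maps a piece name to (own count - opponent count) at a PV leaf.
    """
    score = PAWN * counts.get("PAWN", 0)
    for piece in PIECES:
        score += weights[piece] * counts.get(piece, 0)
    return score


def td_leaf_update(leaves, result, weights, alpha=10000.0, lam=0.7):
    """Return new piece values after one game (hypothetical helper).

    leaves: list of material-difference dicts, one per learner move.
    result: +1 for a win, 0 for a draw, -1 for a loss.
    alpha, lam: assumed learning rate and decay, not values from the post.
    """
    # Squashed predictions, each in (-1, 1), comparable to the game result.
    preds = [math.tanh(BETA * evaluate(c, weights)) for c in leaves]
    # Temporal differences; the last prediction is compared to the result.
    targets = preds[1:] + [float(result)]
    diffs = [t - p for p, t in zip(preds, targets)]

    new_weights = dict(weights)
    acc = 0.0  # running sum of lam-discounted future differences
    for t in range(len(leaves) - 1, -1, -1):
        acc = diffs[t] + lam * acc
        # d tanh(BETA * s) / d weight = BETA * (1 - tanh^2) * piece count
        slope = BETA * (1.0 - preds[t] ** 2)
        for piece in PIECES:
            grad = slope * leaves[t].get(piece, 0)
            new_weights[piece] += alpha * grad * acc
    return new_weights
```

Starting from all-zero piece values, a won game in which the learner was a
knight up pushes KNIGHT upward and leaves the other pieces alone; a lost game
with the same material pushes it down.  Lowering alpha partway through a
match, as described above, shrinks every later step by the same factor.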

This has been a fun little project and I have lots of issues to examine
and improve in my particular implementation, but it seems to work, and
rather well at that.  Eventually this stuff will end up in a future
release of EXchess, but probably not for several months at least.

If any programmers are interested, I can discuss rather generally the
types of changes I needed to make to my program to get this to work.

 - Dan


