Computer Chess Club Archives

Subject: Re: Tuning evaluation function weights

Author: Jonathan Baxter
Date: 17:23:11 10/01/98
On September 30, 1998 at 13:23:27, Don Beal wrote:

>On September 29, 1998 at 21:42:15, Danniel Corbit wrote:
>>
>>Has anyone a pointer to research on tuning evaluation functions using least
>>squares fitting or gradient methods?
>
>You might like to look at Temporal Difference methods, in particular
>the TD-lambda method proposed by Sutton in 1988.  This was used by
>Tesauro to make the world-championship class backgammon player TD-Gammon.
>It's neither least-squares fitting, nor simple gradient descent, but it
>does utilise gradients internally to make small adjustments to weights
>after each move.  It can take thousands of games to reach good values,
>but they can be speed games.
>
>One paper describing the technique is:
>"Learning Piece Values by Temporal Difference learning"
>Beal & Smith, ICCA Journal, Vol 20, No 3, Sept 1997.

Another paper describing the technique is:
"Experiments in Parameter Learning using Temporal Differences"
Baxter, Tridgell and Weaver, ICCA Journal, Vol 21, No 2, June 1998.

In that paper we report an experiment in which our program "KnightCap" improved
from a 1600 rating (on FICS) to a 2150 rating in just 3 days and 308 games of
blitz play. Note that we started with the material parameters already set
to "1 4 4 6 12" for PNBRQ, with all the other parameters set to zero.
You can get the source code for KnightCap from
http://wwwsyseng.anu.edu.au/lsg/knightcap.html (you have the option to turn
learning on or off). An earlier version of the ICCA paper is also available
there.

Currently I am working on an improved version of KnightCap, with better
search and a much *simpler* evaluation function. Preliminary results
show that with parameter learning and *just* piece/square tables we can get
a rating of around 2500 on ICC. This is without any opening learning (the
original version of KnightCap has an opening learning algorithm that is also
described in the above paper).

Another interesting observation is that the material values, although
they start out at 1 4 4 6 12, evolve towards approximately 1.25 3.9 3.9 6 12,
which, once rescaled so that a pawn is 1, is much closer to the traditional
1 3 3 5 9.
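To make the rescaling concrete, here is a small sketch in C that divides the final material values from the table further down (124 390 391 597 1205) by the pawn value. The function name rescale_to_pawn is made up for illustration and is not part of KnightCap. Working it through gives roughly 1 3.15 3.15 4.81 9.72.

```c
#include <stddef.h>

/* Final learnt values for P, N, B, R, Q, copied from the
   IPIECE_VALUES row of the table in this post. */
static const int learnt_material[5] = {124, 390, 391, 597, 1205};

/* Hypothetical helper: divide every value by the pawn value
   (the first entry) so that a pawn is worth exactly 1.0. */
static void rescale_to_pawn(const int *values, size_t n, double *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (double)values[i] / (double)values[0];
}
```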

For those who are interested, I have included all the final piece/square tables
below, after learning for a few hundred games against Crafty.


>The method works for any weights, not just piece values, provided
>that the evaluation consists of a sum of term*weight components.

That restriction isn't necessary. You can use the method for any evaluation
function provided it is a differentiable function of its weights. This includes
linear evaluation functions, but is not restricted to them.
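To illustrate the point about differentiability, here is a minimal sketch of an offline TD(lambda) weight update in C for a *nonlinear* evaluation. This is not KnightCap's code: the feature count NFEAT, the softsign squashing s/(1+|s|), and all function names are assumptions chosen for illustration. The only thing the update needs from the evaluation is its gradient with respect to the weights.

```c
#include <stddef.h>

#define NFEAT 3  /* hypothetical number of evaluation features */

/* Nonlinear but differentiable evaluation: softsign of a weighted sum. */
static double eval(const double *w, const double *x)
{
    double s = 0.0;
    for (size_t k = 0; k < NFEAT; k++)
        s += w[k] * x[k];
    return s / (1.0 + (s < 0 ? -s : s));
}

/* Gradient of eval with respect to the weights: x / (1 + |s|)^2. */
static void eval_grad(const double *w, const double *x, double *g)
{
    double s = 0.0;
    for (size_t k = 0; k < NFEAT; k++)
        s += w[k] * x[k];
    double a = 1.0 + (s < 0 ? -s : s);
    for (size_t k = 0; k < NFEAT; k++)
        g[k] = x[k] / (a * a);
}

/* One TD(lambda) update over a game of n positions. pos holds n feature
   vectors of length NFEAT back to back; result is the final outcome
   (e.g. +1 win, -1 loss), used as the target for the last position.
   Weights are held fixed during the pass and changed once at the end. */
static void td_lambda_update(double *w, const double *pos, size_t n,
                             double result, double alpha, double lambda)
{
    double delta[NFEAT] = {0};
    double trace[NFEAT] = {0};  /* decayed sum of past gradients */
    double g[NFEAT];

    for (size_t t = 0; t < n; t++) {
        const double *x = pos + t * NFEAT;
        double v = eval(w, x);
        double v_next = (t + 1 < n) ? eval(w, x + NFEAT) : result;
        eval_grad(w, x, g);
        for (size_t k = 0; k < NFEAT; k++) {
            trace[k] = lambda * trace[k] + g[k];
            delta[k] += alpha * (v_next - v) * trace[k];
        }
    }
    for (size_t k = 0; k < NFEAT; k++)
        w[k] += delta[k];
}
```

Swapping in a linear evaluation only changes eval and eval_grad (the gradient becomes the feature vector itself); the update loop stays the same.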


Cheers,

Jonathan Baxter


-----------------------

Piece Square tables learnt from a couple of hundred crafty games.
-----------------------------------------------------------------

Note that the board reads left to right, *not* top to bottom. So the left-hand
column is the first *rank* and the right-hand column is the last rank. The first
*row* is the a-file and the last row is the h-file.
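As a sketch of how to read this layout, the lookup below maps a square name to an index within one 64-entry table: each row is a file, each column a rank. The helper name pst_index is made up for illustration, and the squares are taken from White's point of view, which is how the comments in this post read them. The pawn table is copied from the listing in this post.

```c
/* Hypothetical helper: index into one 64-entry piece/square table in
   which each row is a file (a..h) and each column is a rank (1..8). */
static int pst_index(char file, int rank)
{
    return (file - 'a') * 8 + (rank - 1);
}

/* The IPAWN_POS table from this post, one file per row. */
static const int pawn_pos[64] = {
    0, -1, -1,  0, -1,  0,  1,  0,   /* a-file */
    0,  1,  0,  0,  0,  0,  0,  0,   /* b-file */
    0, -2,  1,  0, -2,  1,  1,  0,   /* c-file */
    0, -4,  1,  2,  2,  1,  1,  0,   /* d-file */
    0, -5, -1, -1,  1,  1,  0,  0,   /* e-file */
    0,  2, -1,  1,  1,  1,  0,  0,   /* f-file */
    0,  4,  3,  2,  1,  1,  0,  0,   /* g-file */
    0,  0, -1,  1,  1,  0,  0,  0,   /* h-file */
};
```

For example, pawn_pos[pst_index('d', 2)] picks out the value for a pawn on d2.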

const etype orig_coefficients[] = {
/* IPIECE_VALUES */
      0,    124,    390,    391,    597,   1205,   8000,

These started out at 100 400 400 600 1200 8000. The king value (8000) is
never altered (if it's gone it's checkmate :-) The zero at the start is just
a side effect of the way the parameters are used (it doesn't mean anything).

The following values all started out at zero.

/* IPAWN_POS */
      0,     -1,     -1,      0,     -1,      0,      1,      0,
      0,      1,      0,      0,      0,      0,      0,      0,
      0,     -2,      1,      0,     -2,      1,      1,      0,
      0,     -4,      1,      2,      2,      1,      1,      0,
      0,     -5,     -1,     -1,      1,      1,      0,      0,
      0,      2,     -1,      1,      1,      1,      0,      0,
      0,      4,      3,      2,      1,      1,      0,      0,
      0,      0,     -1,      1,      1,      0,      0,      0,

Remember the board is sideways. Note the high values for pawns in front
of the kingside-castled king (2, 4, 0 on f2, g2, h2). And hang on to that
pawn on b2 as well (for queenside castling). Get those pawns off d2 and e2
(-4, -5) and get those pawns up the board (since there were no separate tables
for the different game stages these values have to work for the whole game,
including the ending).

/* IKNIGHT_POS */
      0,      0,     -1,      0,      0,      0,      0,      0,
     -4,      0,      0,     -1,      0,      0,      0,      0,
      0,     -1,     -1,      0,      0,      0,      0,      0,
     -1,      0,      0,      0,     -1,      0,      0,      0,
     -1,     -2,      0,      0,      1,      0,      0,      0,
     -2,      0,      1,      0,      0,      0,      0,      0,
     -4,      0,      0,      0,      1,      0,      0,      0,
      0,      0,      0,      0,      0,      0,      0,      0,


Big message here: develop those Knights! (they get a -4 penalty for sitting on
their original squares).

/* IBISHOP_POS */
      0,      0,     -1,      0,      0,      0,      0,      0,
      0,      0,      0,      0,      1,      0,      0,      0,
     -4,      0,      0,      0,      0,      0,      0,      0,
      0,     -2,      0,      0,      1,      0,      0,      0,
      0,     -1,      1,      2,      0,      0,      0,      0,
     -5,      0,      0,      1,      0,      0,      0,      0,
      0,      1,      0,      0,      0,      0,      0,      0,
      0,      0,      0,      0,      0,      0,      0,      0,


Again, develop those bishops...


/* IROOK_POS */
     -3,     -1,     -1,      0,      0,      0,      0,      1,
     -3,      0,      0,      0,      0,      0,      0,      0,
     -2,      0,      0,      0,      0,      0,      1,      0,
     -1,      0,      0,      1,      0,      0,      1,      0,
      0,      0,      1,      0,      0,      0,      1,      0,
     -2,      0,      0,      0,      0,      0,      0,      0,
     -3,      0,      0,      0,      0,      0,      1,      0,
     -5,      0,      0,      0,      0,      0,      0,      0,

Get those rooks off their starting squares and onto the d and e-files (another
way to encourage castling). And that 7th rank is pretty juicy too.


/* IQUEEN_POS */
      0,      0,     -1,      0,      0,      0,      0,      0,
      0,      0,     -1,      0,      0,      0,      0,      0,
      0,      2,     -1,      0,      0,      0,      0,      0,
     -1,      0,      0,      0,      0,      0,      0,      0,
      0,      3,      0,      0,      0,      0,      0,      0,
      0,      0,      2,      0,      0,      0,      0,      0,
      0,      0,      0,      0,      0,      0,      0,      0,
      0,      0,      0,      0,      0,      0,      0,      0,


Get the queen off her starting square and onto e2, f3 or c2 (???)

/* IKING_POS */
      0,      0,      0,      0,      0,      0,      0,      0,
      3,      0,      0,      0,      0,      0,      0,      0,
      1,     -1,      0,      0,      0,      0,      0,      0,
     -1,      0,      0,      0,      0,      0,      0,      0,
     -1,     -1,      0,      0,      0,      0,      0,      0,
     -3,     -2,      0,      0,      0,      0,      0,      0,
      3,     -1,      0,      0,      0,      0,      0,      0,
     -2,      0,      0,      0,      0,      0,      0,      0,


CASTLE THAT KING!! (big value for g1 and c1 and b1.) This is good in the
opening but screws up in the ending (KnightCap will keep its king glued
to that g1 square unless it can see a big pawn push advantage :-)

};