Author: Jonathan Baxter
Date: 17:23:11 10/01/98
On September 30, 1998 at 13:23:27, Don Beal wrote:

>On September 29, 1998 at 21:42:15, Danniel Corbit wrote:
>>
>>Has anyone a pointer to research on tuning evaluation functions using least
>>squares fitting or gradient methods?
>
>You might like to look at Temporal Difference methods, in particular
>the TD-lambda method proposed by Sutton in 1988. This was used by
>Tesauro to make the world-championship class backgammon player Neurogammon.
>It's neither least-squares fitting, nor simple gradient descent, but it
>does utilise gradients internally to make small adjustments to weights
>after each move. It can take thousands of games to reach good values,
>but they can be speed games.
>
>One paper describing the technique is:
>"Learning Piece Values by Temporal Difference learning"
>Beal & Smith, ICCA Journal, Vol 20, No 3, Sept 1997.

Another paper describing the technique is:

"Experiments in Parameter Learning using Temporal Differences"
Baxter, Tridgell and Weaver, ICCA 21 (2) June 1998.

In that paper we report an experiment in which our program "KnightCap" learnt
from a 1600 rating (on FICS) to a 2150 rating in just 3 days and 308 games of
blitz play. Note that we started out with the material parameters already set
to "1 4 4 6 12" for PNBRQ, but the rest of the parameters were set to zero.

You can get the source code for KnightCap from
http://wwwsyseng.anu.edu.au/lsg/knightcap.html (you have the option to turn
learning on or off). You can also get an earlier version of the ICCA paper
there.

Currently I am working on an improved version of KnightCap, with better search
and a much *simpler* evaluation function. Interesting preliminary results show
that with parameter learning and *just* piece/square tables we can get a rating
of around 2500 on ICC. This is without any opening learning (the original
version of KnightCap has an opening learning algorithm that is also described
in the above paper.)
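As a rough illustration of "small adjustments to weights" driven by gradients,
here is a minimal C sketch of a TD(lambda) update applied once a game has
finished. It is not taken from the KnightCap source. The three-feature
evaluation, the s / (1 + |s|) squashing function, and the names eval,
td_lambda_update, NFEAT and MAX_POS are all assumptions made for the example.

```c
#include <assert.h>

#define NFEAT   3    /* assumed number of evaluation features */
#define MAX_POS 512  /* assumed upper bound on positions per game */

/* Assumed evaluation: a weighted feature sum s squashed into (-1, 1)
 * by s / (1 + |s|), so it is differentiable in the weights.
 * If dv_ds is non-null it receives the derivative of the squashing. */
static double eval(const double *w, const double *f, double *dv_ds)
{
    double s = 0.0;
    for (int i = 0; i < NFEAT; i++)
        s += w[i] * f[i];
    double a = 1.0 + (s < 0.0 ? -s : s);
    if (dv_ds)
        *dv_ds = 1.0 / (a * a);   /* derivative of s / (1 + |s|) */
    return s / a;
}

/* One TD(lambda) pass over a finished game.
 * feats[t] holds the features of position t (0 <= t < n); outcome is the
 * final result on the same (-1, 1) scale as eval(). */
static void td_lambda_update(double *w, const double feats[][NFEAT], int n,
                             double outcome, double alpha, double lambda)
{
    double v[MAX_POS + 1], slope[MAX_POS], step[NFEAT] = { 0.0 };
    assert(n > 0 && n <= MAX_POS);

    for (int t = 0; t < n; t++)
        v[t] = eval(w, feats[t], &slope[t]);
    v[n] = outcome;               /* the game result anchors the last difference */

    /* Walk backwards so that g = sum over j >= t of lambda^(j-t) * d_j,
     * where d_j = v[j+1] - v[j] is the temporal difference. */
    double g = 0.0;
    for (int t = n - 1; t >= 0; t--) {
        g = (v[t + 1] - v[t]) + lambda * g;
        for (int i = 0; i < NFEAT; i++)
            step[i] += alpha * slope[t] * feats[t][i] * g;
    }
    for (int i = 0; i < NFEAT; i++)
        w[i] += step[i];
}
```

With all weights at zero, a single position whose first feature is 1, a win
(outcome 1.0) and alpha = 0.1, the first weight moves to 0.1 and the other two
stay at zero. Only the gradient term (slope times feature) depends on the form
of the evaluation, so a different differentiable evaluation would change eval()
and nothing else.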
Another interesting observation is that the material values, although they
start out at 1 4 4 6 12, evolve towards approximately 1.25 3.9 3.9 6 12, which
is much closer to the traditional 1 3 3 5 9 (if you rescale so a pawn is 1).

For those who are interested, I have included all the final piece/square tables
below, after learning for a few hundred games against Crafty.

>The method works for any weights, not just piece values, provided
>that the evaluation consists of a sum of term*weight components.

This isn't true. You can use the method for any evaluation function provided it
is a differentiable function of its weights. This includes linear evaluation
functions, but is not restricted to them.

Cheers,

Jonathan Baxter

-----------------------

Piece Square tables learnt from a couple of hundred Crafty games.
-----------------------------------------------------------------

Note that the board reads left to right *not* top to bottom. So the left hand
column is the first *rank* and the right hand column is the last rank. The
first *row* is the A file and the last row is the H file.

const etype orig_coefficients[] = {
/* IPIECE_VALUES */
0, 124, 390, 391, 597, 1205, 8000,

/* These started out at 100 400 400 600 1200 8000. The king value (8000) is
   never altered (if it's gone it's checkmate :-) The zero at the start is
   just a side effect of the way the parameters are used (it doesn't mean
   anything).

   The following values all started out at zero. */

/* IPAWN_POS */
 0, -1, -1,  0, -1,  0,  1,  0,
 0,  1,  0,  0,  0,  0,  0,  0,
 0, -2,  1,  0, -2,  1,  1,  0,
 0, -4,  1,  2,  2,  1,  1,  0,
 0, -5, -1, -1,  1,  1,  0,  0,
 0,  2, -1,  1,  1,  1,  0,  0,
 0,  4,  3,  2,  1,  1,  0,  0,
 0,  0, -1,  1,  1,  0,  0,  0,

/* Remember the board is sideways. Note the high values for pawns in front of
   the king side castled king (2, 4, 0). And hang on to that pawn on b2 as
   well (for queen side castling). */
/* Get those pawns off d2 and e2 (-4, -5) and get those pawns up the board
   (since there were no separate tables for the different game stages these
   values have to work for the whole game, including the ending). */

/* IKNIGHT_POS */
 0,  0, -1,  0,  0,  0,  0,  0,
-4,  0,  0, -1,  0,  0,  0,  0,
 0, -1, -1,  0,  0,  0,  0,  0,
-1,  0,  0,  0, -1,  0,  0,  0,
-1, -2,  0,  0,  1,  0,  0,  0,
-2,  0,  1,  0,  0,  0,  0,  0,
-4,  0,  0,  0,  1,  0,  0,  0,
 0,  0,  0,  0,  0,  0,  0,  0,

/* Big message here: develop those Knights! (they get a -4 penalty for sitting
   on their original squares). */

/* IBISHOP_POS */
 0,  0, -1,  0,  0,  0,  0,  0,
 0,  0,  0,  0,  1,  0,  0,  0,
-4,  0,  0,  0,  0,  0,  0,  0,
 0, -2,  0,  0,  1,  0,  0,  0,
 0, -1,  1,  2,  0,  0,  0,  0,
-5,  0,  0,  1,  0,  0,  0,  0,
 0,  1,  0,  0,  0,  0,  0,  0,
 0,  0,  0,  0,  0,  0,  0,  0,

/* Again, develop those bishops... */

/* IROOK_POS */
-3, -1, -1,  0,  0,  0,  0,  1,
-3,  0,  0,  0,  0,  0,  0,  0,
-2,  0,  0,  0,  0,  0,  1,  0,
-1,  0,  0,  1,  0,  0,  1,  0,
 0,  0,  1,  0,  0,  0,  1,  0,
-2,  0,  0,  0,  0,  0,  0,  0,
-3,  0,  0,  0,  0,  0,  1,  0,
-5,  0,  0,  0,  0,  0,  0,  0,

/* Get those rooks off their starting squares and onto the d and e-files
   (another way to encourage castling). And that 7th rank is pretty juicy
   too. */

/* IQUEEN_POS */
 0,  0, -1,  0,  0,  0,  0,  0,
 0,  0, -1,  0,  0,  0,  0,  0,
 0,  2, -1,  0,  0,  0,  0,  0,
-1,  0,  0,  0,  0,  0,  0,  0,
 0,  3,  0,  0,  0,  0,  0,  0,
 0,  0,  2,  0,  0,  0,  0,  0,
 0,  0,  0,  0,  0,  0,  0,  0,
 0,  0,  0,  0,  0,  0,  0,  0,

/* Get the queen off her starting square and onto e2, f3 or c2 (???) */

/* IKING_POS */
 0,  0,  0,  0,  0,  0,  0,  0,
 3,  0,  0,  0,  0,  0,  0,  0,
 1, -1,  0,  0,  0,  0,  0,  0,
-1,  0,  0,  0,  0,  0,  0,  0,
-1, -1,  0,  0,  0,  0,  0,  0,
-3, -2,  0,  0,  0,  0,  0,  0,
 3, -1,  0,  0,  0,  0,  0,  0,
-2,  0,  0,  0,  0,  0,  0,  0,

/* CASTLE THAT KING!! (big value for g1 and c1 and b1.) This is good in the
   opening but screws up in the ending (KnightCap will keep its king glued to
   that g1 square unless it can see a big pawn push advantage :-) */
};
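Because the tables are stored sideways, looking up a square takes a little
care. The following C sketch shows one way to read them, together with the
"rescale so a pawn is 1" arithmetic for the learnt material values. The pawn
table and material values are copied from the post; the helper names psq_index
and in_pawns are hypothetical, and the sketch only covers squares named from
White's side, as the commentary does.

```c
#include <assert.h>

/* Pawn table copied from the post: one row per file (a..h),
 * one column per rank (1..8). */
static const int pawn_pos[64] = {
     0, -1, -1,  0, -1,  0,  1,  0,   /* a-file */
     0,  1,  0,  0,  0,  0,  0,  0,   /* b-file */
     0, -2,  1,  0, -2,  1,  1,  0,   /* c-file */
     0, -4,  1,  2,  2,  1,  1,  0,   /* d-file */
     0, -5, -1, -1,  1,  1,  0,  0,   /* e-file */
     0,  2, -1,  1,  1,  1,  0,  0,   /* f-file */
     0,  4,  3,  2,  1,  1,  0,  0,   /* g-file */
     0,  0, -1,  1,  1,  0,  0,  0,   /* h-file */
};

/* Learnt material values from the post, in the order P, N, B, R, Q. */
static const int piece_value[5] = { 124, 390, 391, 597, 1205 };

/* Hypothetical helper: table index of a square such as ('d', 2).
 * The file selects the row and the rank selects the column. */
static int psq_index(char file, int rank)
{
    assert(file >= 'a' && file <= 'h');
    assert(rank >= 1 && rank <= 8);
    return (file - 'a') * 8 + (rank - 1);
}

/* Hypothetical helper: material value in pawn units (pawn == 1.0).
 * piece is 0..4 for P, N, B, R, Q. */
static double in_pawns(int piece)
{
    assert(piece >= 0 && piece < 5);
    return (double)piece_value[piece] / (double)piece_value[0];
}
```

For example, psq_index('g', 2) is 49 and pawn_pos[49] is the 4 quoted for the
g2 pawn, while psq_index('d', 2) and psq_index('e', 2) pick out the -4 and -5.
In pawn units the learnt values come out at roughly 1, 3.15, 3.15, 4.81 and
9.72 for P, N, B, R and Q.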
Last modified: Thu, 15 Apr 21 08:11:13 -0700