Author: Tom Kerrigan
Date: 04:44:55 12/29/01
On December 29, 2001 at 05:21:21, Rafael Andrist wrote:

>On December 29, 2001 at 05:11:26, Tom Kerrigan wrote:
>
>>Here's my idea.
>>
>>You have a position and you want your program to play a certain move (which it
>>presumably isn't playing). You run this algorithm:
>>
>>1. Search the position, get a PV. The evaluation of the last position of the PV
>>is eval(1).
>>2. Search only the move that you want your program to make, get a PV. This
>>end-point evaluation is eval(2).
>>3. Figure out which eval terms are different between eval(1) and eval(2).
>>Decrease the weights of all the different eval(1) terms slightly. Increase the
>>eval(2) terms slightly.
>>4. Repeat until the program plays the move you want.
>>
>>You could run this on lots of positions from GM games, to get your program to
>>play like a GM. (At least in some positions, heh.)
>>
>>Has this been done before? Are there any glaring problems with this idea? Does
>>anybody want to try this? If so, I'd like some credit for it. If not, I'll
>>probably get around to trying it sometime...
>>
>>-Tom
>
>You may be interested in the temporal difference learning algo by Richard Sutton
>which is implemented in the program KnightCap by Andrew Tridgell and Jonathan
>Baxter.
>http://www.syseng.anu.edu.au/lsg/knightcap.html

Thank you. They've done some terrific work.

My idea is different from theirs, though, I believe. To make broad generalizations, their goal is to make the eval function a better predictor of the future, while my goal is to make it produce known-good moves. Different means to the same end, presumably... :)

-Tom
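The four-step loop described in the post could be sketched roughly like this. This is only a toy illustration, not code from KnightCap or any real engine: the feature vectors per move stand in for running a real search and evaluating the PV end-point, and the move names, term values, and step size are all made up for the example.

```python
def evaluate(weights, features):
    """Eval is a weighted sum of term values at a PV end-point."""
    return sum(w * f for w, f in zip(weights, features))

def tune_toward_move(weights, move_features, target_move, step=0.01, max_iters=1000):
    """Nudge eval-term weights until the program's best move is target_move.

    move_features maps each candidate move to the eval-term values at the
    end of its PV (a hypothetical stand-in for searching each move).
    """
    weights = list(weights)
    for _ in range(max_iters):
        # Step 1: find the move the program currently prefers (its "PV").
        best = max(move_features, key=lambda m: evaluate(weights, move_features[m]))
        if best == target_move:
            return weights  # Step 4's stopping condition: the desired move is played.
        f_best = move_features[best]           # terms behind eval(1)
        f_target = move_features[target_move]  # Step 2: terms behind eval(2)
        # Step 3: for terms that differ, shrink weight toward eval(1)'s terms
        # and grow weight toward eval(2)'s terms.
        for i, (fb, ft) in enumerate(zip(f_best, f_target)):
            if fb != ft:
                weights[i] += step * (ft - fb)
    return weights

# Hypothetical example: two eval terms (material, mobility), three moves.
features = {
    "Nf3":  [0.0, 3.0],   # quiet developing move, high mobility
    "Qxb7": [1.0, -1.0],  # grabs a pawn, poor mobility
    "h3":   [0.0, 0.5],
}
w = tune_toward_move([1.0, 0.1], features, "Nf3")
```

With the initial weights the program prefers Qxb7; after a few iterations of step 3 the mobility term has gained enough weight that Nf3 comes out on top, which is exactly the "repeat until it plays the move you want" behavior the post describes.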