Author: Anthony Cozzie
Date: 12:19:16 06/28/04
Go up one level in this thread
On June 28, 2004 at 15:06:59, Sune Fischer wrote: >On June 28, 2004 at 08:54:00, Anthony Cozzie wrote: > >>Most of you probably know by now that my "secret project" with Bob has been >>evaluation autotuning. I feel that with the improvements in parallel search and >>CPU speed, the evaluation is beginning to be become more important than the >>search in top programs (e.g. Shredder). As the number of terms increases, the >>number of relations increases quadratically and the problem becomes harder. I >>don't think that autotuning will ever be as good as hand tuning, but it is a >>good way to get a first guess for the value of a parameter, as well as seeing >>whether a new parameter is worth anything. This weekend, aside from watching >>massive amounts of "Hajime no Ippo", I have finished a run with the latest >>tweaks to the algorithm. >> >>The basic idea (which actually dates to my senior year in college, when I took >>CAD tools, 18760) is to use simulated annealing. Simulated annealing makes >>random changes and accepts them probablistically: > >I suspect it won't matter much whether you use SA, GA or some more primitive >gradient descent sceme. >If you initialize with good handtuned values it should fall directly into the >main attractor with a deep (global?) minimum. > >I think the core problem is to get good estimates for the real eval(). >You probably need millions of those estimates for SA, so playing a full game for >each "datapoint" does not seem like an option. > >-S. Genetic algorithms and comparison training work completely differently. Genetic algorithms: Make a random change. Play X games. If the modified version wins, keep the change. Otherwise don't. The problem with this is the huge number of games. A small eval change might require 500 games to get right; even at 1 minute per game that is still 20+ hours. Comparison Training: Get a bunch of GM games. Have the computer search each position, and find the "dominant position" (the position at the end of the PV). Compute the eval of the position after the best move vs the position after some other move. If the best move has a better score, then all is good. Otherwise penalize. Then I use the simulated annealing algorithm to decide whether to keep the change. I'm not asking for people to run my experiment, I am asking for people to see if the values produced do better or worse. anthony
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.