Author: Stuart Cracraft
Date: 10:37:28 07/11/04
On July 10, 2004 at 23:56:37, Robert Hyatt wrote:

>On July 10, 2004 at 15:55:58, Stuart Cracraft wrote:
>
>>On July 10, 2004 at 12:07:40, Robert Hyatt wrote:
>>
>>>On July 10, 2004 at 11:20:46, Stuart Cracraft wrote:
>>>
>>>>On July 09, 2004 at 08:52:25, Robert Hyatt wrote:
>>>>
>>>>>On July 09, 2004 at 07:16:41, Volker Böhm wrote:
>>>>>
>>>>>>On July 08, 2004 at 11:44:32, Robert Hyatt wrote:
>>>>>>
>>>>>>>One fatal flaw. This will produce a set of values that will optimize your
>>>>>>>results against a test set. But that's not the same thing as producing a set of
>>>>>>>values that will optimize your results in actual OTB games.
>>>>>>>
>>>>>>>This is a common mistake.
>>>>>>
>>>>>>Have you got an idea how to automatically optimize settings for OTB games?
>>>>>
>>>>>I have tried the test set route. It simply didn't work. I have altered the
>>>>>settings to produce better results on ICC, and hurt test set results. My search
>>>>>parameters are tunable by simple commands to crafty, so it is easy to automate a
>>>>>big test set and vary each parameter over some range. There was definitely one
>>>>>or two overall combinations that were best for test positions, but not for OTB
>>>>>play...
>>>>>
>>>>>>Currently I need about 2 days on two computers to test if one setting is better
>>>>>>than another (with acceptable low error rate). The following experiences make
>>>>>>things hard:
>>>>>>
>>>>>>1. You can only find settings that are at least 5% better (gets 5% more points)
>>>>>>by testing. Optimizations below 5% will need too many games to give a
>>>>>>statistically "proven" (I use "it is better with a probability of 95% or more")
>>>>>>result.
>>>>>>2. Even for a "5%" better you need about 200 test-games.
>>>>>>3. The result will differ with different time-control. I ignore this problem
>>>>>>currently.
>>>>>
>>>>>There are several setup parameters that will cause problems. You just pointed
>>>>>out one, the time control. Blitz is different from longer games and the
>>>>>parameters will likely be different. Of course this is an important "detail"
>>>>>that an engine should deal with, but I do not myself. But I can see where it
>>>>>would be good to tune for the time control, somewhat like my "adaptive hash"
>>>>>tunes the hash size for the time control, automatically...
>>>>>
>>>>>You also can't ignore the opponent. I'd expect to find different parameter sets
>>>>>for different opponents would be more optimal as well.
>>>>>
>>>>>>4. The result will differ with different opponents. I use a set of opponents.
>>>>>>
>>>>>>Thus optimization is really hard work for me!
>>>>>>
>>>>>>Greetings Volker
>>>>>
>>>>>It's hard for everyone. :)
>>>>
>>>>Automate it!
>>>>
>>>>Make everything in your program easily changeable by the program
>>>>itself -- so everything is a variable indexable by a single number
>>>>and that expands out to an entirely different opponent.
>>>
>>>Already done...
>>>
>>
>>Outstanding!
>>
>>>>Then let it loose on ICS/FICS with some method to keep those values
>>>>that contribute to good moves and wins and discard or change those
>>>>values that don't. Many methods for autotuning exist. It's been pretty well
>>>>understood that in chess using just the end-of-game result won't help tune it
>>>>very quickly as compared to per-move results.
>>>
>>>That's the problem. Figuring out which are good and which are bad...
>>>
>>
>>Have you seen Whitwell and Kendall's paper on Evolutionary Computing?
>
>Yes. And others from TD learning back to Frey's book with the othello learning
>stuff and lots of others in between. But the problem is not so easy. IE if you
>test against GM positions, suppose your program doesn't normally reach those
>positions? Then the tuning fails. What if the GM games have significant
>numbers of weak/bad moves (Yes, GM players make 'em also)?
>
>It doesn't seem to be obvious to me how to handle this...

Why did the GM regression-generated eval work for Deep Blue and Deep Thought I/II?
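
As an aside on the "about 200 test-games for 5%" figure quoted above: here is a rough sanity check, only a sketch under my own assumptions and not Volker's actual method. It reads "5% better" as a true score of 55% instead of 50%, treats games as independent, uses a normal approximation with a one-sided 95% threshold (z = 1.645), and takes the per-game variance at an even score. The name games_needed and the draw_rate parameter are made up for illustration. It returns the number of games at which an observed margin of that size would just clear the threshold, so it says nothing about how often a real 5% edge would actually be detected.

```python
import math


def games_needed(edge, draw_rate=0.0, z=1.645):
    """Games after which an observed score of 0.5 + edge just clears
    a one-sided z threshold against an even score (normal approximation).

    edge      -- assumed score margin over 50%, e.g. 0.05 for a 55% score
    draw_rate -- assumed fraction of drawn games (hypothetical parameter)
    z         -- 1.645 is the usual one-sided 95% normal quantile
    """
    if edge <= 0:
        raise ValueError("edge must be positive")
    if not 0.0 <= draw_rate < 1.0:
        raise ValueError("draw_rate must be in [0, 1)")
    # Per-game score is 1, 0.5 or 0. At an even score, wins and losses
    # each take (1 - draw_rate) / 2, which gives this variance.
    variance = 0.25 * (1.0 - draw_rate)
    # Require edge >= z * sqrt(variance / n) and solve for n.
    return math.ceil(z * z * variance / (edge * edge))


if __name__ == "__main__":
    for draws in (0.0, 0.3):
        print(draws, games_needed(0.05, draw_rate=draws))
```

With no draws this gives 271 games for a 5% margin, and with an assumed 30% draw rate it gives 190, which is in the neighborhood of the 200 quoted. Since the count scales with 1/edge^2, a 1% margin needs about 25 times as many games, which fits the remark that improvements below 5% take too many games to confirm.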
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.