Computer Chess Club Archives


Subject: Re: Optimizing a Chess Program's Settings Painlessly

Author: Stuart Cracraft

Date: 10:37:28 07/11/04


On July 10, 2004 at 23:56:37, Robert Hyatt wrote:

>On July 10, 2004 at 15:55:58, Stuart Cracraft wrote:
>
>>On July 10, 2004 at 12:07:40, Robert Hyatt wrote:
>>
>>>On July 10, 2004 at 11:20:46, Stuart Cracraft wrote:
>>>
>>>>On July 09, 2004 at 08:52:25, Robert Hyatt wrote:
>>>>
>>>>>On July 09, 2004 at 07:16:41, Volker Böhm wrote:
>>>>>
>>>>>>On July 08, 2004 at 11:44:32, Robert Hyatt wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>One fatal flaw.  This will produce a set of values that will optimize your
>>>>>>>results against a test set.  But that's not the same thing as producing a set of
>>>>>>>values that will optimize your results in actual OTB games.
>>>>>>>
>>>>>>>This is a common mistake.
>>>>>>
>>>>>>Have you got an idea how to automatically optimize settings for OTB games?
>>>>>
>>>>>I have tried the test set route.  It simply didn't work.  I have altered the
>>>>>settings to produce better results on ICC, and hurt test set results.  My search
>>>>>parameters are tunable by simple commands to crafty, so it is easy to automate a
>>>>>big test set and vary each parameter over some range.  There were definitely one
>>>>>or two overall combinations that were best for test positions, but not for OTB
>>>>>play...
>>>>>
>>>>>
>>>>>>
>>>>>>Currently I need about 2 days on two computers to test if one setting is better
>>>>>>than another (with an acceptably low error rate). The following experiences make
>>>>>>things hard:
>>>>>>
>>>>>>1. You can only find settings that are at least 5% better (get 5% more points)
>>>>>>by testing. Optimizations below 5% will need too many games to give a
>>>>>>statistically "proven" (I use "it is better with a probability of 95% or more")
>>>>>>result.
>>>>>>2. Even for a "5%" improvement you need about 200 test games.
>>>>>>3. The result will differ with different time-control. I ignore this problem
>>>>>>currently.
>>>>>
>>>>>There are several setup parameters that will cause problems.  You just pointed
>>>>>out one, the time control.  Blitz is different from longer games and the
>>>>>parameters will likely be different.  Of course this is an important "detail"
>>>>>that an engine should deal with, but I do not myself.  But I can see where it
>>>>>would be good to tune for the time control, somewhat like my "adaptive hash"
>>>>>tunes the hash size for the time control, automatically...
>>>>>
>>>>>You also can't ignore the opponent.  I'd expect that different parameter sets
>>>>>would be optimal for different opponents as well.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>4. The result will differ with different opponents. I use a set of opponents.
>>>>>>
>>>>>>Thus optimization is really hard work for me!
>>>>>>
>>>>>>Greetings Volker
>>>>>
>>>>>
>>>>>
>>>>>It's hard for everyone. :)
>>>>
>>>>Automate it!
>>>>
>>>>Make everything in your program easily changeable by the program
>>>>itself -- so everything is a variable indexable by a single number
>>>>and that expands out to an entirely different opponent.
>>>
>>>Already done...
>>>
>>
>>Outstanding!
>>
>>>>
>>>>Then let it loose on ICS/FICS with some method to keep those values
>>>>that contribute to good moves and wins and discard or change those
>>>>values that don't. Many methods for autotuning exist. It's been pretty well
>>>>understood that in chess using just the end-of-game result won't help tune it
>>>>very quickly as compared to per-move results.
>>>
>>>
>>>That's the problem.  Figuring out which are good and which are bad...
>>>
>>
>>Have you seen Whitwell and Kendall's paper on Evolutionary Computing?
>
>Yes.  And others from TD learning back to Frey's book with the othello learning
>stuff and lots of others in between.  But the problem is not so easy.  IE if you
>test against GM positions, suppose your program doesn't normally reach those
>positions?  Then the tuning fails.  What if the GM games have significant
>numbers of weak/bad moves (Yes, GM players make 'em also)?
>
>It doesn't seem to be obvious to me how to handle this...

Why did the GM regression-generated eval work for Deep Blue and Deep Thought
I/II?
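
As a rough check on Volker's figures quoted above (a 5% edge needing about 200 games at 95% confidence), here is a minimal sketch of the arithmetic. The assumptions are mine, not from the thread: "5% better" is read as scoring 55% instead of 50%, games are independent, a normal approximation is used, and the test is one-sided. The function name games_needed and the draw_rate parameter are hypothetical. It only asks when an observed 55% score becomes significant; it does not account for statistical power, so a real test would want more games.

```python
import math
from statistics import NormalDist


def games_needed(edge, draw_rate=0.0, confidence=0.95):
    """Games needed before an observed score of 0.5 + edge is
    significantly above 0.5 (one-sided, normal approximation).

    edge       -- score advantage as a fraction, e.g. 0.05 for 55%
    draw_rate  -- assumed fraction of drawn games (hypothetical input)
    confidence -- one-sided confidence level
    """
    if not 0.0 < edge < 0.5:
        raise ValueError("edge must be between 0 and 0.5")
    if not 0.0 <= draw_rate < 1.0:
        raise ValueError("draw_rate must be in [0, 1)")

    # z-value for the requested one-sided confidence (about 1.645 at 95%).
    z = NormalDist().inv_cdf(confidence)

    # Per-game score is 1, 0.5 or 0. Between equal opponents the mean is
    # 0.5 and the variance works out to 0.25 * (1 - draw_rate).
    variance = 0.25 * (1.0 - draw_rate)

    # Need z * sqrt(variance / n) <= edge, solved for n.
    return math.ceil((z / edge) ** 2 * variance)


if __name__ == "__main__":
    for draws in (0.0, 0.3, 0.5):
        print(f"draw rate {draws:.0%}: {games_needed(0.05, draws)} games")
    # Halving the edge roughly quadruples the number of games.
    print(f"2.5% edge, no draws: {games_needed(0.025)} games")
```

Under these assumptions the count lands in the low hundreds for a 5% edge, which is the same order as Volker's 200, and it grows with the inverse square of the edge, which is why improvements below 5% become so expensive to confirm.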


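One possible reading of "everything is a variable indexable by a single number", from my earlier post quoted above, is a table of tunable parameters where one integer decodes into a complete configuration. The sketch below is only an illustration of that reading. The parameter names and candidate values are hypothetical and are not taken from Crafty or any other engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    name: str
    values: tuple  # candidate settings to try for this parameter


# Hypothetical tunables, for illustration only.
PARAMS = (
    Param("null_move_reduction", (2, 3)),
    Param("futility_margin", (100, 200, 300)),
    Param("passed_pawn_bonus", (10, 20, 30, 40)),
)


def config_count(params=PARAMS):
    """Total number of distinct configurations."""
    total = 1
    for p in params:
        total *= len(p.values)
    return total


def decode(index, params=PARAMS):
    """Expand a single number into a full parameter setting.

    The index is treated as a mixed-radix number: each parameter
    consumes one digit whose base is its number of candidate values.
    """
    if not 0 <= index < config_count(params):
        raise IndexError("configuration index out of range")
    config = {}
    for p in params:
        index, digit = divmod(index, len(p.values))
        config[p.name] = p.values[digit]
    return config


if __name__ == "__main__":
    print(config_count(), "configurations")
    for i in (0, 1, 23):
        print(i, decode(i))
```

Each index then names one "opponent", so an automated match runner only has to record a number alongside each result. The catch raised earlier in the thread still applies: the number of configurations multiplies with every parameter added, while each comparison needs hundreds of games.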
