Computer Chess Club Archives




Subject: Re: learning to tune parameters by comp-comp games

Author: Dann Corbit

Date: 12:35:19 12/28/00


On December 28, 2000 at 15:18:54, Uri Blass wrote:

>How much rating can programs earn by playing against themselves?
>I think that it is possible to improve the rating of programs by playing a lot
>of games between the program and itself when you change one parameter (for
>example increasing the value of pawn by 5%).
>It is possible to play a lot of games and stop only when there is a difference
>of 70 in order to learn if increasing the value of pawn by 5% is a good change
>or a bad change (we need a big difference because the difference from a small
>change is usually small and we can often get wrong results if we stop only at
>a small difference).
>If you find that increasing the value of pawn by 5% is productive you do the
>change and the program learned to increase the value of pawn.
>After it you continue in doing similar tests.
>I think that programmers need a lot of beta testers in order to do all these
>tests and the question is what is the size of the improvement that you can get
>by these tests.
>I know that people can claim that you can improve the program in playing against
>itself when you do not improve it against other programs but I believe that most
>of the improvement is an improvement against other programs (at least in cases
>when the decision of the programmer is to do symmetric evaluation).
>The interesting question is how much improvement programmers can get by this way
>if they have enough money to pay for beta testers so they can get enough games.
>Other interesting questions are if there are examples when the same evaluation
>change is productive in 1 minute per game and counter productive in longer time
>control and if there are examples when A beats B, B beats C but C beats A (I mean
>when all the results are significant results).
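The stopping rule in the quoted post can be sketched roughly as follows. This is only an illustration: it assumes "a difference of 70" means a lead of 70 points (wins minus losses), and run_match, make_simulated_game, and the probabilities passed to them are hypothetical names and values, not taken from any real engine or test harness.

```python
import random


def run_match(play_game, lead_to_stop=70, max_games=100_000):
    """Play games until one side leads by lead_to_stop points.

    play_game() must return the modified version's score for one game:
    1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    Returns (lead, games_played); lead > 0 favours the modified version.
    """
    lead = 0.0
    for games in range(1, max_games + 1):
        score = play_game()
        lead += score - (1.0 - score)  # win: +1, draw: 0, loss: -1
        if abs(lead) >= lead_to_stop:
            return lead, games
    return lead, max_games  # no decision within the game budget


def make_simulated_game(win_prob, draw_prob, rng):
    """Stand-in for a real engine-vs-engine game (assumption for the demo)."""
    def play_game():
        r = rng.random()
        if r < win_prob:
            return 1.0
        if r < win_prob + draw_prob:
            return 0.5
        return 0.0
    return play_game


if __name__ == "__main__":
    rng = random.Random(2000)
    # Made-up probabilities: the modified version is slightly stronger.
    lead, games = run_match(make_simulated_game(0.36, 0.30, rng))
    print(f"lead {lead:+.0f} after {games} games")
```

In a real test, play_game would launch one game between the original and the modified program. The smaller the true strength difference, the more games the match needs before the lead reaches the threshold, which is why so many testers would be required.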

Programs that improve their evaluation function upon play {and come with source
code} are:
Chessterfield (good implementation -- program has two forms -- a learner and
one that uses the learned data)
BACE (good implementation, learns as it plays)
SAL {ick, but at least it compiles OK and you can have fun beating it}

In addition, the source code and game database for the Deep Blue team's learning
function are available at Tim Mann's site.

I think the Chessterfield idea is a good one.  Have one form of your program
that learns as it plays.  Write the data out, and have the other form of your
program use the learned information to evaluate better.  Obviously, if you are
reexamining your evaluation function as you play, that will detract from the
engine strength.
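A minimal sketch of that two-form split, under stated assumptions: the JSON file, the piece letters, the centipawn values, and the function names below are all hypothetical and are not Chessterfield's actual data format or code. The learner form calls save_weights after it has adjusted its values; the playing form calls load_weights once at startup and never relearns during a game.

```python
import json

# Hypothetical starting weights in centipawns; not taken from any real engine.
DEFAULT_WEIGHTS = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900}


def save_weights(weights, path):
    """Learner form: write the current weights out after a learning session."""
    with open(path, "w") as f:
        json.dump(weights, f)


def load_weights(path):
    """Playing form: read learned weights, falling back to the defaults."""
    try:
        with open(path) as f:
            return {**DEFAULT_WEIGHTS, **json.load(f)}
    except FileNotFoundError:
        return dict(DEFAULT_WEIGHTS)


def evaluate(piece_diff, weights):
    """Material-only evaluation from the side to move's point of view.

    piece_diff maps a piece letter to (own count - opponent count).
    """
    return sum(weights[piece] * diff for piece, diff in piece_diff.items())
```

Because the playing form only reads a file, it pays none of the cost of reexamining the evaluation function during a game.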


Last modified: Thu, 07 Jul 11 08:48:38 -0700
