Computer Chess Club Archives




Subject: Re: Automatic Eval Tuning

Author: Vincent Diepeveen

Date: 04:44:15 07/05/01


On July 04, 2001 at 09:12:35, JW de Kort wrote:

>On June 29, 2001 at 14:05:06, Jon Dart wrote:
>>On June 29, 2001 at 11:14:34, Artem Pyatakov wrote:
>>>I am curious, have people here experimented or extensively used Eval Function
>>>tuning based on GM games for example?
>>>If so, is it effective to any extent?
>>There is a program called KnightCap that implemented eval
>>learning and it worked quite well. Source is available.
>Hi, I tried to understand what they are doing by reading a paper on their site.
>But being a tax lawyer, my mathematics bothered me. Can someone please explain
>their method in language that is understandable to someone with only high
>school mathematics?
>Jan Willem

Oh well, the scientific world (the AI part) is going to kill me
if I say this, but all those learning methods come down to
the next thing:
  In a program there are values; for example, an open file might
  get a 0.20 pawn bonus.

All those bonuses and penalties we call parameters.

Automatic tuning, in whatever form, more or less randomly
guesses a different value for every parameter.

Then the program plays a few games, or more likely only runs a few
tests. Based upon the result of the test, it is decided whether this
tuning is better or worse.
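The guess-test-keep loop described above can be sketched in a few lines. This is a minimal illustration, not how KnightCap actually learns: the evaluate() fitness function below is a made-up stand-in for "play a few games or run a few tests", with two hypothetical parameters and invented ideal values.

```python
import random

def evaluate(params):
    # Placeholder for "play a few games / run a few tests".
    # Pretend the ideal open-file bonus is 0.20 pawn and the ideal
    # passed-pawn bonus is 0.35 pawn; score is negative squared error.
    ideal = {"open_file": 0.20, "passed_pawn": 0.35}
    return -sum((params[k] - ideal[k]) ** 2 for k in ideal)

def random_tune(rounds=1000, seed=42):
    rng = random.Random(seed)
    best = {"open_file": 0.0, "passed_pawn": 0.0}
    best_score = evaluate(best)
    for _ in range(rounds):
        # Randomly guess a different value for every parameter.
        candidate = {k: rng.uniform(-1.0, 1.0) for k in best}
        score = evaluate(candidate)
        # Keep the guess only if the test says it is better.
        if score > best_score:
            best, best_score = candidate, score
    return best
```

With enough rounds the random guesses drift toward the ideal values, but only because this toy fitness function is cheap; with real game-playing tests every round is expensive, which is exactly the problem discussed below.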

Obvious problems
  - if the way of testing is very primitive, so will the conclusions
    about a parameter set be
  - less obvious, but the hard reality is that from a game-playing
    viewpoint the best versions of Diep were always the versions which did
    less well on the testsets. Too aggressive evaluation tuning of course
    solves way more positions, but then every move there is perhaps a 50%
    chance that you play a patzer move and lose the game because of that,
    whereas a more positional alternative is more likely to win the game
  - no independent conclusions get drawn like humans can draw them.
    Like when we watch a program play and say: "hey, it is getting way too
    many points for open files, it goes to the open files quickly but
    neglects the rest of the play!"
  - if there is 1 parameter in an evaluation,
    then the automatic learning needs
    to try all values for that 1 parameter before it knows which is best.
    Suppose we have 2001 values for 1 parameter (-1000 to +1000); then
    that's 2001 experiments. Now for 2 parameters it would need
    2001 x 2001 experiments.
    Obviously for 100 parameters it needs 2001^100, and
    for 1000 it needs 2001^1000. That's a number with so many zeros
    that it will not fit on your paper any time soon.
    In short, automatic learning's weak point is the huge number of
    experiments needed. Obviously every researcher in automatic tuning
    focuses upon methods to get that number of experiments as
    small as possible.
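The blow-up in that bullet is easy to check directly, assuming the same 2001 candidate values per parameter as in the text:

```python
# Exhaustive grid search: experiments = values_per_param ** num_params.
values_per_param = 2001           # -1000 .. +1000 in steps of 1

two_params = values_per_param ** 2
print(two_params)                 # 4004001 experiments for 2 parameters

# For 100 and 1000 parameters, just count the digits of the result.
print(len(str(values_per_param ** 100)))    # 331 digits
print(len(str(values_per_param ** 1000)))   # 3302 digits
```

A 3302-digit experiment count is indeed a number that will not fit on your paper, which is why the research effort goes into needing far fewer experiments than the full grid.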

    Till today, however, no one has succeeded in getting it linear. To do
    that you need intelligence and domain-dependent knowledge.

    That last must not be underestimated, but the first still gets hugely
    underestimated by non-scientists.

    The learning program usually uses very simplistic algorithms to
    get to its numbers.

    The average chess program's search is usually way smarter than the
    most complicated learning that is somehow producing results.

    The trick 100% of all researchers use to show that their learning
    works is to compare with a most insane, so-called 'handpicked',
    parameter set.

    If I would tune for only a day, then let the automatic tuning of
    KnightCap learn and learn and learn for 10 years or more, and then
    play knightcap-diepeveen-tuned versus knightcap-selftuned,
    then we would see a huge score difference. Of course, if I were
    the KnightCap programmer, I would start with a bonus for doubled pawns
    and a penalty for putting rooks on open files. In that way I could
    conclude that learning works.

    This is how 100%, or maybe 99.9%, of the conclusions come about.

    Way too optimistic. It's really a childish thing!

    Now it's unfair to say something bigtime negative about KnightCap,
    because the level of this experiment is already way above other
    tuning experiments. The average tuning experiment is set up way
    smaller, and if such a childish average experiment gets filmed by
    the Discovery Channel or whatever, then they also directly conclude
    that in the future robots are going to take over.


Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.