Computer Chess Club Archives


Subject: calibrating the evaluation function

Author: Jay Scott

Date: 04:55:18 07/31/98



On July 31, 1998 at 01:51:57, blass uri wrote:

>The programs I know give me evaluation in pawns and I prefer to see
>in the evaluation function the predicted result of the game(number between 0
>and 1) and not an evaluation in pawns.

For my part, I'd prefer a probability distribution giving the chances
of a win, loss or draw. But the chess programmers don't seem to have
any plans for it. What you are asking for can be called the equity of
the position. It's (probability of win) + 0.5 * (probability of draw),
if you assume that a draw is worth 0.5. (In a tournament or match, the
value of a draw may be more or less than 0.5, depending on the
tournament or match situation.)
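
To make the formula concrete, here is a minimal Python sketch. The function name `equity` and the parameter `draw_value` are my own labels for illustration, and the probabilities in the example are made-up numbers, not measurements from any program.

```python
def equity(p_win, p_draw, draw_value=0.5):
    """Expected score: a win counts 1, a draw counts draw_value, a loss counts 0."""
    if not (0.0 <= p_win <= 1.0 and 0.0 <= p_draw <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    if p_win + p_draw > 1.0 + 1e-9:
        raise ValueError("p_win + p_draw must not exceed 1")
    return p_win + draw_value * p_draw


# Illustrative numbers only: 40% win, 35% draw, 25% loss.
normal = equity(0.40, 0.35)          # a draw is worth half a point
must_win = equity(0.40, 0.35, 0.0)   # match situation where a draw is as bad as a loss
print(normal, must_win)
```

Changing `draw_value` is how the tournament or match situation enters: the same win/draw/loss distribution gives a different equity when a draw is worth more or less than half a point.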

You can use Komputer Korner's table (from his posting in this thread)
to get a rough idea of how to convert a score from pawns to equity.
However, it may be that every program is different. If so, you'll have
to calibrate each program you're interested in separately.

One way to estimate a program's score-to-equity conversion is to
have the program play a lot of games against itself (it should be
against an equal opponent, and what opponent is more equal than
itself?). Divide the range of scores into equal intervals, maybe 0-0.2,
0.2-0.4, etc. For each interval, count up the number of times that
a score in that interval occurred in won, lost, and drawn games.
Then you know what a score in that interval means.
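
The counting step could be sketched like this in Python. Everything here is an assumption for illustration: scores are taken as integer centipawns (to avoid floating-point trouble at interval boundaries), each sample is a `(score_cp, result)` pair with the result given from the point of view of the side the score refers to, and the names `calibrate` and `width_cp` are mine.

```python
from collections import defaultdict


def calibrate(samples, width_cp=20):
    """Tally results per score interval.

    samples: iterable of (score_cp, result), result in {'win', 'draw', 'loss'}.
    Returns {interval_start_cp: {'win': n, 'draw': n, 'loss': n, 'equity': x}},
    where an interval covers [start, start + width_cp).
    """
    counts = defaultdict(lambda: {"win": 0, "draw": 0, "loss": 0})
    for score_cp, result in samples:
        start = (score_cp // width_cp) * width_cp  # floor, also for negative scores
        counts[start][result] += 1

    table = {}
    for start in sorted(counts):
        c = counts[start]
        n = c["win"] + c["draw"] + c["loss"]
        # Equity with a draw valued at 0.5.
        table[start] = dict(c, equity=(c["win"] + 0.5 * c["draw"]) / n)
    return table


# Tiny made-up sample, only to show the shape of the output.
demo = [(5, "win"), (15, "draw"), (19, "loss"), (20, "win"), (-1, "loss")]
for start, row in calibrate(demo).items():
    print(start, row)
```

With real data you would want many samples per interval before trusting the equity column.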

You need a lot of games to make the statistics valid. It would be nice
to automate the process. For example, to do it with Crafty you'd
like to write a program that reads Crafty's log and tallies the
scores by game result.
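
As a rough sketch of the automation, suppose the scores have already been pulled out of the logs into a plain text file with one `<score in pawns> <win|draw|loss>` pair per line. That file format is invented here for illustration; it is not Crafty's actual log format, and extracting the scores from a real log is a separate step this sketch does not attempt. The name `read_samples` is also mine.

```python
def read_samples(lines):
    """Parse lines like '0.35 win' into (score_cp, result) pairs.

    Blank lines and lines starting with '#' are skipped.
    Scores are converted from pawns to integer centipawns.
    """
    samples = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        score_text, result = line.split()
        if result not in ("win", "draw", "loss"):
            raise ValueError(f"unknown result: {result!r}")
        samples.append((round(float(score_text) * 100), result))
    return samples


# Made-up input in the assumed format.
demo_lines = ["# score result", "0.35 win", "", "-1.20 loss", "0.00 draw"]
print(read_samples(demo_lines))

# With a file (hypothetical filename):
#     with open("scores.txt") as f:
#         samples = read_samples(f)
```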

I'd like to recommend this exercise to chess programmers as a way
to test the meaning and validity of their evaluation functions.
You can also use it to examine individual evaluation factors.
For example, if you're wondering about your two-bishops bonus,
you can run the numbers only for positions where one side has
the advantage of two bishops. If the bonus is too big, you should
expect to see a flatter curve, or a shifted curve, because the score
goes up more than the chance of winning does. You're less likely to
see an effect if the bonus is too small, because the side with two
bishops will be willing to give them up without taking full advantage
of them.
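
One way to run the numbers for a single factor is to compute the equity curve twice, once over all positions and once over only the positions that have the feature, and compare them interval by interval. The sketch below assumes each sample carries a boolean flag such as "this side has the two bishops"; the sample layout and the names `equity_curve` and `compare_feature` are hypothetical.

```python
from collections import defaultdict

RESULT_VALUE = {"win": 1.0, "draw": 0.5, "loss": 0.0}


def equity_curve(samples, width_cp=50):
    """samples: iterable of (score_cp, result). Returns {interval_start_cp: equity}."""
    totals = defaultdict(lambda: [0.0, 0])  # [sum of result values, count]
    for score_cp, result in samples:
        start = (score_cp // width_cp) * width_cp
        totals[start][0] += RESULT_VALUE[result]
        totals[start][1] += 1
    return {start: s / n for start, (s, n) in totals.items()}


def compare_feature(samples, width_cp=50):
    """samples: iterable of (score_cp, result, has_feature).

    Returns {interval_start_cp: (equity_all, equity_with_feature)} for every
    interval that contains at least one position with the feature.
    """
    samples = list(samples)
    all_curve = equity_curve([(s, r) for s, r, _ in samples], width_cp)
    feat_curve = equity_curve([(s, r) for s, r, f in samples if f], width_cp)
    return {k: (all_curve[k], feat_curve[k]) for k in sorted(feat_curve)}


# Made-up sample, only to show the output shape.
demo = [(60, "win", True), (70, "loss", True), (80, "win", False), (90, "win", False)]
print(compare_feature(demo))
```

If the feature equity sits consistently below the overall equity at the same score, that is the pattern I'd expect from a bonus that is too big, though with few samples per interval the difference may just be noise.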

Summary: You get more information from the detailed behavior of the
evaluation function in test games than you get from only the results
of the games.

  Jay



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.