Computer Chess Club Archives


Subject: Re: Knowledge again, but what is it?

Author: Jay Scott

Date: 14:23:52 02/25/98


On February 25, 1998 at 05:09:41, Amir Ban wrote:
>Let me make a try at this: The evaluation of a position should be a
>measure of the expected outcome of the game (with assumed perfect play),
>i.e. it can be mapped to a probability of winning. Say, with a score of
>+0.5 you expect to win 65%, and with a score of -4 you expect to win
>0.6%. The probability for winning should be monotonic with the score, or
>else something is bad with the function. So one way to define a better
>evaluation is if it is more monotonic. You can also actually decide on
>score-to-probability mapping with some exponential say, and declare that
>the better evaluation is the one that fits better the mapping.
>
>I think this definition is on the right track, but there is something
>clearly wrong with it: First, there is an almost infinite number of such
>functions that would give you a perfect fit, but most of them are
>nonsense. For example, an evaluation function that always returns 0 is
>perfect in this sense, but obviously useless. A less extreme example is
>an evaluation that limits itself to score from +0.5 to -0.5, but does
>that perfectly. This is a perfect but wishy-washy evaluation, so not
>very useful, because practically you need to know that capturing the
>queen gives you 99.9% win, and this evaluation never guarantees more
>than say 70%. Of course, if you are already a piece ahead, this
>evaluation gives you no guidance at all.

I think this is a different topic, so I replied separately.

It's like the weather report. If it says there's a 30% chance
of rain tomorrow, then on about 30% of such days it does rain
(it's true!). Meteorologists are said to be "well-calibrated" in
that sense.
Ideally you'd like to hear either that there's a 100% chance
of rain or a 0% chance, but in practice if they did that then
they'd be wrong a lot of the time.

A chess program is in an exactly analogous position. It would
like to be as discriminating as possible while staying
well-calibrated.

I don't think it's a serious problem in practice. You can
measure both calibration and discrimination from game data.
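
A rough sketch of how such a measurement might look, in Python. Nothing
here comes from a real engine: the logistic mapping, the constant K (picked
only so that +0.5 maps to roughly 65%, as in the quoted example), the bucket
count, and all function names are assumptions for illustration. Outcomes
are 1.0 for a win and 0.0 otherwise; counting a draw as 0.5 would be
another possible convention.

```python
import math
from collections import defaultdict

# Hypothetical steepness, chosen so that a score of +0.5 gives about 65%.
K = 1.24

def win_probability(score, k=K):
    """Assumed logistic mapping from an evaluation score to a win probability."""
    return 1.0 / (1.0 + math.exp(-k * score))

def calibration_table(samples, n_buckets=10):
    """Bucket (predicted probability, outcome) pairs by predicted probability.

    Returns one (mean predicted, observed win rate, count) row per
    non-empty bucket, in order of increasing prediction.
    """
    buckets = defaultdict(list)
    for predicted, outcome in samples:
        # Clamp so that a prediction of exactly 1.0 lands in the top bucket.
        index = min(int(predicted * n_buckets), n_buckets - 1)
        buckets[index].append((predicted, outcome))
    table = []
    for index in sorted(buckets):
        entries = buckets[index]
        mean_predicted = sum(p for p, _ in entries) / len(entries)
        observed = sum(o for _, o in entries) / len(entries)
        table.append((mean_predicted, observed, len(entries)))
    return table

def calibration_error(table):
    """Count-weighted mean gap between predicted and observed win rates."""
    total = sum(n for _, _, n in table)
    return sum(n * abs(p - o) for p, o, n in table) / total

def discrimination(table):
    """Count-weighted variance of bucket win rates around the overall rate.

    Zero means every bucket wins equally often, i.e. the evaluation
    tells you nothing, however well calibrated it is.
    """
    total = sum(n for _, _, n in table)
    base_rate = sum(n * o for _, o, n in table) / total
    return sum(n * (o - base_rate) ** 2 for _, o, n in table) / total

# An evaluation that always says 50%, on games that split evenly.
constant = [(0.5, 1.0), (0.5, 0.0)] * 50
# An evaluation that says 90% or 10% and is right about those odds.
sharp = ([(0.9, 1.0)] * 9 + [(0.9, 0.0)] * 1 +
         [(0.1, 1.0)] * 1 + [(0.1, 0.0)] * 9)

for name, samples in (("constant", constant), ("sharp", sharp)):
    table = calibration_table(samples)
    print(name, calibration_error(table), discrimination(table))
```

Both made-up evaluations come out well calibrated, but only the second
one discriminates between positions. In real use the samples would be
win_probability(score) for positions taken from played games, paired with
the results of those games.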

There is another degenerate case: the evaluation which says that
every position is a loss. A program that uses it will probably
find that it is accurate. :-) This is a practical difficulty in
temporal difference learning, though it's not too hard to work
around.

  Jay



Last modified: Thu, 15 Apr 21 08:11:13 -0700
