Computer Chess Club Archives


Subject: Re: Knowledge again, but what is it?

Author: Fernando Villegas

Date: 16:11:55 02/25/98


On February 25, 1998 at 12:13:57, Don Dailey wrote:

>On February 25, 1998 at 05:09:41, Amir Ban wrote:
>
>>On February 25, 1998 at 00:15:59, Don Dailey wrote:
>>
>>>Well said Fernando,
>>>
>>>I don't think it's possible to separate the two concepts.  Search IS
>>>knowledge (one form of it.)   I think we tend to artificially separate
>>>the two concepts which is probably a mistake.  It's similar to chess
>>>players arguing over tactical vs positional play.  It's more a human
>>>concept than anything else.   I fear we do subject our programs to
>>>our own superstitions but cannot help ourselves.
>>>
>>
>>Actually I think it would be very interesting to separate the concepts,
>>but I don't see that we know how.
>
>I once actually tried to do something like this but
>was not successful.  I wanted to define positional
>vs tactical chess (even if arbitrarily) but there
>were several problems.  For one thing, even the
>values of the pieces should be considered a very
>relevant piece of knowledge, so already your search
>is knowledge-based.  I consider a few other terms
>as purely material (various piece-cooperation
>terms such as the bishop pair), but now this is even
>more knowledge.  The truth of the matter in my opinion
>is that it is futile to separate the two.  If I had
>to define PURE tactics I would say that it is only
>knowledge of win, loss or draw as seen at end nodes in
>a search.  A quiescence search might not even be
>considered relevant in this kind of search.
>
>All of the chess concepts we read about in books and
>apply to our own chess (and to our programs) are in a
>sense purely artificial.  The pieces have no real
>fixed values but are estimates of winning chances.
>When checkmate happens it's completely irrelevant
>who has extra pieces, backward pawns etc.

That is precisely the question! Besides, to talk of "evaluation"
instead of knowledge does not solve the issue, any more than explaining
that knowledge refers to lines of code. Well, yes, that is the
operational implementation, but it is obvious that in the evaluation
code the programmer tries to mimic the supposed efficacy and truth of
Knowledge. The more I play chess, the more I see that the so-called
theoretical principles are almost always true only when they are
truisms; each time they go beyond the point of obvious evidence, the
exceptions begin. Chess is most of the time a kind of complicated
structure where general principles make little sense for the specifics
of the moment. So when a "stupid" fast searcher gets the upper hand, it
is precisely because it just sees what really happens, without the
deviations produced by any kind of knowledge. Theoretical knowledge, in
chess as elsewhere, is a shortcut derived from our limitations; since we
cannot see everything in the specific case, we try to guide ourselves
with general principles. But then, if we can see more, and even
exhaustively, inside a determinate field of problems, as chess often
becomes, knowledge becomes just a nuisance. The more powerful a chess
player is, the less he thinks in terms of general principles; nobody
thinks more in rules than a medium, expert-level player; that is the
reason he is just an expert.
Fernando

>
>If you consider a purely tactical program (one that only
>knows win, loss and draw) and that is capable of a few hundred
>plies of searching, you will see right away that our
>chess ideas and concepts are designed only for the
>human mind (and now, it turns out, useful for chess
>program evaluation functions too!); they have only
>abstract meaning.  Such deep searches do not need them!
>
>I did plot my program's evaluation against actual win
>percentages a couple of years ago and found a very nice
>smooth curve.  The score was an excellent predictor of
>winning chances.  But as you say, this does not really
>tell me how good the evaluation function is.
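
(A minimal sketch of the kind of plot described here, in Python; the
list of (score, result) pairs and the bucket width are illustrative
assumptions, not anything from an actual program.)

    # Bucket evaluation scores and compute the empirical win rate per bucket.
    # `samples` is assumed to be a list of (score_in_pawns, result) pairs,
    # where result is 1.0 for a win, 0.5 for a draw, 0.0 for a loss,
    # all from the point of view of the side the score is for.
    from collections import defaultdict

    def win_rate_by_score(samples, bucket_width=0.25):
        buckets = defaultdict(list)
        for score, result in samples:
            key = round(score / bucket_width) * bucket_width
            buckets[key].append(result)
        # The average result per bucket approximates the winning chances
        # at that score; a good evaluation should give a smooth curve.
        return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}
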
>
>Here is a useful experiment someone could try:
>
>Plot this graph against thousands of positions with a known
>result, for example a large database of master games. Now take
>any single term (I'll use isolated pawn on open file with rooks
>present for my example) and throw out all positions with this
>feature existing.  Make the same plot with this subset of positions.
>
>Now examine the resulting plot compared against the first plot.
>I argue that the plot should match (with minor statistical error.)
>If they do not match, it's a good indication that your isolated-pawn
>weight does not match well with the rest of your evaluation.  For
>instance, if a score of 0.50 represents an 80% chance of winning, but
>only a 60% chance of winning when considering these isolated-pawn
>positions, then it indicates your isolated-pawn value is too high
>(relative to the other evaluation.)  In other words your position is
>not as good as your program thinks it is, but the isolated-pawn
>stuff jacks it up too high.
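
(A sketch of this comparison, continuing the Python above;
`has_feature` stands in for a hypothetical detector of "isolated pawn
on an open file with rooks present", and `samples` here carries the
position as well.)

    # Compare the score-vs-win-rate curve over all positions with the curve
    # for the subset having a given feature.  A systematic gap at the same
    # score suggests the feature's weight is out of line with the rest of
    # the evaluation.
    def compare_feature(samples, has_feature):
        # samples: list of (score, result, position) triples
        everything = win_rate_by_score([(s, r) for s, r, p in samples])
        with_f = win_rate_by_score(
            [(s, r) for s, r, p in samples if has_feature(p)])
        for score in sorted(set(everything) & set(with_f)):
            print(f"{score:+.2f}  all: {everything[score]:.2f}"
                  f"  with feature: {with_f[score]:.2f}"
                  f"  gap: {everything[score] - with_f[score]:+.2f}")

    # In the example above, a +0.50 score meaning 80% overall but only 60%
    # on the feature positions would suggest the isolated-pawn term is off.
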
>
>We have a student who wants to experiment with automated fine tuning
>based on this principle.  There are a lot of hairy issues like how
>to get a reasonably accurate sample space and other testing
>methodology issues but it might turn out interesting.
>
>Your question of how to compare evaluation functions without searching?
>Perhaps there is a way to compare degree of error.  An evaluation of
>zero or .5 probability all the time will predict win probability
>correctly, as you state, but can be very wrong if you choose a sample
>of positions that are all wins!  It seems that we have to start with a
>large set of positions that are absolutely known to be wins, losses or
>draws.  THEN we can judge the evaluation function by its total error.
>Some type of fit needs to be made to the score so that the evaluation
>function is returning an expectancy of winning, and not a conventional
>score.  Then we can judge it on a position-by-position basis for total
>error.  In this way, simply giving an even evaluation score on every
>position should greatly increase its error.  Of course how to construct
>a set like this is not so clear.
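
(A sketch of this total-error comparison; the logistic squashing and
the constant k are assumptions standing in for "some type of fit", and
the labelled position data is taken as given.)

    import math

    def win_expectancy(score, k=1.2):
        # Map a conventional score (in pawns) to a winning expectancy in
        # [0, 1].  The logistic shape and k are illustrative; they would
        # have to be fitted to real data as described above.
        return 1.0 / (1.0 + math.exp(-k * score))

    def total_error(evaluate, labelled_positions, k=1.2):
        # labelled_positions: (position, true_result) pairs, with
        # true_result 1.0 / 0.5 / 0.0 for a known win / draw / loss.
        return sum((win_expectancy(evaluate(pos), k) - result) ** 2
                   for pos, result in labelled_positions)

    # An evaluation returning 0 everywhere predicts 0.5 for every position,
    # so over a set of known wins and losses its total error grows quickly,
    # which is exactly the penalty wanted here.
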
>
>- Don
>
>
>>When we talk about knowledge, we usually mean evaluation. (This is the
>>part that the user senses. I know that some people also call "knowledge"
>>all kinds of fancy stuff whose purpose is to guide and help the search
>>engine. I don't quite agree, but for argument's sake let's stick to
>>evaluation). So a "knowledge" program has good evaluation, better than a
>>non-knowledge program anyway.
>>
>>We all know what the BEST evaluation is. It's the one coming out of
>>perfect knowledge of the game. But what is good evaluation? More
>>precisely, given two evaluation functions, how do you decide which is
>>better?
>>
>>I think most people, after a moment's thought would say: Put a search
>>engine on top of both and let them play a match (or run a suite, if you
>>will). This is not satisfactory, for practical reasons, because some
>>engines will do better with certain kinds of evaluations, or will behave
>>differently at different time controls, and for theoretical reasons,
>>because you would expect that good (or better) evaluation is something
>>that can and should be defined without the parasite of searching. When
>>you have an "objectively good" evaluation, you would expect that any
>>engine would do well with it.
>>
>>Can anyone come forward with a way of comparing evaluation functions,
>>not necessarily a practical one, that does not involve searching?
>>
>>Let me make a try at this: The evaluation of a position should be a
>>measure of the expected outcome of the game (with assumed perfect play),
>>i.e. it can be mapped to a probability of winning. Say, with a score of
>>+0.5 you expect to win 65%, and with a score of -4 you expect to win
>>0.6%. The probability of winning should be monotonic in the score, or
>>else something is bad with the function. So one way to define a better
>>evaluation is as one that is more nearly monotonic. You can also decide
>>on a score-to-probability mapping with some exponential, say, and
>>declare that the better evaluation is the one that fits the mapping
>>better.
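
(As a small illustrative check, a logistic mapping fitted through the
first example point above roughly reproduces the second one; this
functional form is only one possible choice of "some exponential".)

    import math

    # Pick k so that a score of +0.5 maps to a 65% winning chance...
    k = 2 * math.log(0.65 / 0.35)           # about 1.24 per pawn
    # ...then the same curve at a score of -4 gives roughly 0.7%,
    # close to the 0.6% quoted above.
    p_minus_4 = 1 / (1 + math.exp(4 * k))
    print(round(k, 2), round(p_minus_4, 4))
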
>>
>>I think this definition is on the right track, but there is something
>>clearly wrong with it: First, there is an almost infinite number of such
>>functions that would give you a perfect fit, but most of them are
>>nonsense. For example, an evaluation function that always returns 0 is
>>perfect in this sense, but obviously useless. A less extreme example is
>>an evaluation that limits itself to scores from +0.5 to -0.5, but does
>>that perfectly. This is a perfect but wishy-washy evaluation, so not
>>very useful, because practically you need to know that capturing the
>>queen gives you 99.9% win, and this evaluation never guarantees more
>>than say 70%. Of course, if you are already a piece ahead, this
>>evaluation gives you no guidance at all.
>>
>>I'm sure this definition can be improved on, but currently I don't know
>>how.
>>
>>Amir


