Author: Robert Hyatt
Date: 12:45:08 01/14/04
On January 14, 2004 at 15:01:33, Bob Durrett wrote:

>On January 14, 2004 at 14:07:26, Robert Hyatt wrote:
>
>>On January 14, 2004 at 12:58:53, Bob Durrett wrote:
>>
>>>On January 14, 2004 at 12:43:25, Robert Hyatt wrote:
>>>
>>>>On January 14, 2004 at 11:32:11, Bob Durrett wrote:
>>>>
>>>>>On January 14, 2004 at 09:24:06, Robert Hyatt wrote:
>>>>>
>>>>>>On January 14, 2004 at 06:36:38, Chris Taylor wrote:
>>>>>>
>>>>>>>Why, what are the for and against?
>>>>>>>I play mainly auto232, and have quite a lot of games. As a result, had I not cleared learning, what would this mean?
>>>>>>
>>>>>>What will happen is that your book will dry up and go away. You _must_ lose a game with nearly every opening at some point in time, and without some caution, that will make that line unplayable. And once it is unplayable, there is no way for it to become playable again without your intervention.
>>>>>
>>>>>Perhaps it is not merely a joke to characterize Crafty's automatic opening book learning as: "Snip, snip, snip!"
>>>>
>>>>Correct, as well as _any_ program that uses "result learning", IE Fritz. When it loses, it flags part of that line with "don't ever play again."
>>>>
>>>>>If, instead of outright snipping, the probabilities were merely adjusted by some small amount, then the phenomenon you describe might not happen, or at least not as often. It seems to be a matter of stability. If there is no human intervention, then things could go terribly wrong!
>>>>
>>>>Crafty does both. IE it can learn just by playing the game, looking at the scores for a while after it leaves book, and adjusting the book accordingly. Or it does result learning, so that if it wins it will play the line again, and if it loses, it will not.
>>>>
>>>>You can select either/or/both, but using result learning has the catch-all glitch I mentioned.
>>>>
>>>>>Bob H., in a previous bulletin you provided a "thumbnail sketch" of the criteria Crafty uses to determine whether or not to "snip." If I understood it correctly, you said that the engine monitored the position evaluation scores as soon as the program was out of the opening book, and if the evaluation changed adversely by a certain amount over the next five or ten moves, then Crafty snipped. If so, the final outcome of the game has no impact on book learning in Crafty. Did I get that right?
>>>>
>>>>Partially. That is one learning mode, "book learning". There is another, harsher version called "result learning". Here, after playing a particular opening, it will repeat or avoid that line depending on whether it won or lost the game, which is more "violent" than the simple book learning.
>>>
>>>Bob H., I dislike your "violent result learning." [Like a killer karate chop]
>>>
>>>In the first place, I am a non-violence sort of guy. [Karate is BAD!] : )
>>
>>As many know, I had the same experience. I did it for 25+ years, but about 12 years ago my doctor said "stop the kicking nonsense or get ready for a complete knee replacement." (This on a knee that was hurt playing high-school football in 1964 and surgically repaired, or "butchered" as my current doc calls 1964 medicine.) So I suppose it is bad in some respects... :)
>>>If the engine has code in it which is capable of monitoring the progress of position evaluation scores [throughout the game], and if this code processes such information, then I see no obvious reason why the final result of the game should be a major driver in the decision "to snip or not to snip" ["That is the question", Shakespeare].
>>
>>A point or two. There are two things you learn if you watch your program play a complete game. (1) When you drop out of book, you notice that the position is favorable, unfavorable, or fairly equal.
>
>To some extent that determination can be made by humans PRIOR to the new engine playing any games at all. This human determination can be based on a combination of historical human experience with that opening and the historical silicon experience. It is well known that certain openings popular among humans are actually objectively disadvantageous. For example, people play the Sicilian not because it offers clear equality but because it offers an imbalanced position where there are many ways for White to go wrong.

I would agree, but this kind of "preparation" is also done by the computers. IE that is what Crafty does when it spends several minutes building a binary opening book from a couple of million PGN games... It is looking at how the games ended up, how often a move was played, the static evaluation at that point in the game, etc...
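To make that concrete, here is a rough sketch of the kind of weight such a book-building pass might compute for each candidate move, from how often the move was played, how those games ended, and the static evaluation of the resulting position. This is not Crafty's code; the structure, the function name, and the particular mixing formula are invented purely for illustration.

#include <stdio.h>

struct book_move_stats {
    int games;        /* how many PGN games played this move              */
    int wins;         /* games eventually won by the side that played it  */
    int draws;
    int losses;
    int static_eval;  /* static evaluation of the resulting position (cp) */
};

/* Combine popularity, results, and evaluation into one playing weight.
   The 10.0 mixing factor is arbitrary, chosen only for the example.      */
static double book_weight(const struct book_move_stats *s)
{
    if (s->games == 0)
        return 0.0;
    double score = (s->wins + 0.5 * s->draws) / s->games;  /* 0.0 .. 1.0 */
    double eval  = s->static_eval / 100.0;                  /* in pawns   */
    return s->games * score + 10.0 * eval;
}

int main(void)
{
    struct book_move_stats e4 = { 50000, 20000, 20000, 10000,  20 };
    struct book_move_stats h4 = {    40,     5,    10,    25, -35 };
    printf("weight(e4) = %.1f\n", book_weight(&e4));
    printf("weight(h4) = %.1f\n", book_weight(&h4));
    return 0;
}

Whatever the exact formula, the point is that popularity, results, and evaluation all fold into a single preference number before the engine ever plays a game with the book.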
>>How you make that judgement can vary: you can look at the first search (first move out of book), or the 5th move out of book, or a combination (as I do in Crafty) of the first 10 moves out of book. But from that you might learn that you really don't want to play the position, because you could come out of book in a bad position, but your opponent blunders and you still win.
>>
>>My "book learning", however, does look at what happens after leaving the book, to see if the program was happy or not. And it will remember that, so that the next time it will try the line again if it was happy, or it might try something else if it was not so happy. But here the "not so happy" move is not marked "do not play"; it just sinks to the bottom of the list of playable moves, and we try the others first to see if we can improve things.
>>
>>(2) When the game ends, you now know what that book line leads to against this opponent at least, and maybe against all opponents. There is a difference between getting a superior or inferior book position, and whether you win or lose the game. In a match, you _must_ vary if you lost, or you will lose again. That's why "result learning" is needed. This is often called "aggressive book learning", while the normal book learning in Crafty might be called "fishing around book learning"...
>>
>>In Crafty, my "result learning" does not reward wins, but does punish losses so that they don't get repeated. It doesn't reward a win because it might have been a blunder (meat makes mistakes) and repeating might not be wise.
>>
>>>It often happens in real life that games start going horribly wrong long before the end of the game.
>>
>>Certainly, but remember that we are talking about the "computer chess player" here, which is made up of the opening book and the engine. If the engine is deterministic, it is up to the opening book to vary so that we won't play the same game again and lose it again.
>>
>>>I cannot speak as a 3600 player, of course. Nevertheless, decisions about keeping, modifying, or rejecting opening lines should depend more on what happens soon after leaving the opening than, say, on some blunder eighty moves later in an endgame.
>>
>>Again, in the case of Crafty, I do both. But clearly if I lose, and I play the _same_ opening again, I am going to lose _again_. Remember that the search engine is completely deterministic from that point of view. You have to do _something_, or else rack up a steady stream of losses each time you play that color with that opening.
>
>One general remark:
>
>It seems to me that it should usually be possible for the engine to determine whether or not the current opening book should be blamed for a loss. If it is determined that the loss was likely due to blunders or weak moves played later in the game, then changing the opening may not be indicated at all, other than making the line longer and then forcing a deviation later in that new line. I don't think Crafty does that, because you said in an earlier bulletin that Crafty does not add moves to its opening book.

Correct, it does not add to the book. MchessPro is the only program I have ever seen that does this (adds moves).

However, in the case of computer chess there are two worlds. In the "ideal" world, you are correct. You should be able to attribute the loss to the book or to the program's play after reaching a perfectly good book opening position. And that can be done, and actually _is_ done, in Crafty. But in the "real" world, we play matches, and it is critical that we not reach a reasonable position only to lose it with bad play. The book can cover for that "bad play" if we can make it lead us into a different position that we might play better...

With Crafty, you can use either/both of those learning modes...

learn=1 enables book learning alone, which is an assessment of what happens the first ten moves out of book. Then the book is adjusted to favor or disfavor that line (and by how much) depending on the factors discussed in the learning paper.

learn=2 enables result learning alone, which disables openings that led to losses, regardless of how the position looked on leaving the book.

learn=3 turns both on.
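A minimal sketch of how those two modes might fit together at the end of a game. It is not Crafty's implementation: the structures and names are invented, and treating learn=3 as the bit-wise combination of 1 and 2 is an assumption based only on "learn=3 turns both on".

#include <stdio.h>

#define LEARN_BOOK   1   /* learn=1: "book learning"                   */
#define LEARN_RESULT 2   /* learn=2: "result learning"; learn=3 = both */

struct book_line {
    int weight;      /* relative preference among the playable book moves */
    int playable;    /* 0 means "don't ever play this line again"         */
};

/* Book learning: look at the searches for the first ten moves out of book
   and nudge the line's weight up or down.  A demoted line only sinks in
   the list of playable moves; it stays playable.                          */
static void book_learn(struct book_line *line, const int evals[10])
{
    int trend = evals[9] - evals[0];   /* centipawns gained or lost        */
    line->weight += trend / 10;        /* small, reversible adjustment     */
}

/* Result learning: punish losses only.  Wins are not rewarded, since the
   win may have come from an opponent blunder in an inferior position.     */
static void result_learn(struct book_line *line, int we_lost)
{
    if (we_lost)
        line->playable = 0;            /* the "snip" */
}

static void end_of_game(int learn, struct book_line *line,
                        const int evals[10], int we_lost)
{
    if (learn & LEARN_BOOK)
        book_learn(line, evals);
    if (learn & LEARN_RESULT)
        result_learn(line, we_lost);
}

int main(void)
{
    int evals[10] = { 15, 10, 5, 0, -10, -25, -40, -60, -80, -120 };
    struct book_line line = { 100, 1 };

    end_of_game(3, &line, evals, 1);   /* learn=3 with a lost game */
    printf("weight=%d playable=%d\n", line.weight, line.playable);
    return 0;
}

The actual adjustment in Crafty depends on more than a simple first-versus-tenth-move difference (that is what the learning paper covers); the sketch only shows the division of labor: book learning nudges a weight and is reversible, while result learning removes a losing line outright.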
>Incidentally, "completely deterministic" is unfair, since you went to a lot of trouble in earlier threads to show otherwise. The times used by the opponent for each move are not under the control of the engine. [The specific individual is not under the control of the engine either, but maybe the engine can at least notice who the opponent is and use prior information as to the best openings to play against that opponent.]
>
>Yet again, I ask: "Did I get that right?"

Real world: a non-parallel program will play the same move if given the same amount of time and, if it ponders, if the opponent takes the same amount of time. The only vague point is "what is the same amount of time?" The answer is that it varies, but in blitz-type games, 3-4-5 seconds of variability might not be enough to make the program vary in its move choice.

I often talk about non-deterministic behavior, but that is in the context of trying to prove that a program actually played a move, by repeating the position at a later time. If that fails once out of a hundred times, it is worthless as a proof of anything. But in games, if you play the same moves most of the time, that's more than enough to get you killed, over and over and over.

So computers are deterministic _enough_ to cause themselves great problems by repeating games, while they are non-deterministic enough to make it very difficult to prove that they actually played a move by themselves without any human assistance, because they might vary every now and then. Throw in a second cpu or two or three and that variability goes up. But matches, such as in the SSDF, are not using parallel search. Of course, they often don't use an engine's book learning either, since they run the engines via some commercial GUI...

>Bob D.