Author: Robert Hyatt
Date: 14:06:29 06/21/00
On June 21, 2000 at 11:33:12, Mogens Larsen wrote:

>On June 21, 2000 at 11:16:48, Robert Hyatt wrote:
>
>>Crafty's book learning was written up in the JICCA a year or so ago. I can
>>probably dig up an electronic copy if needed. It was very specific in
>>explaining how the book learning works. "position learning" was written up
>>in the JICCA by Dave Slate and Tony Scherzer several years ago. I basically
>>implemented exactly what they did for position learning.
>
>If it's possible, could you outline the learning method applied with Crafty? It
>doesn't have to be in great detail, "just" the underlying principles. I would
>like to see the JICCA article about book learning if you can find it.
>
>Best wishes...
>Mogens

For an ASCII copy, email me.

The learning algorithm works like this:

1. Normal book learning. I save the scores for the first 10 moves out of book and use them to develop a "learning value". I use 10 scores so that if I start off at -1.00 (a pawn gambit) but the score later climbs, I notice that the learn value should not be bad.

Basically, I use the "high point after the lowest point" if the score starts off negative and drops. (This is the pattern produced by a gambit, where the score continues to drop for a few moves until the search can see that the resulting attack or counter-play is worthwhile.) If the score starts off positive, I use the low point after the highest point. This is where I accept a gambit, and the score continues to climb for a few moves before it begins to "see the truth".

I then modify this learn value based on the search depth (lesser depth brings the score closer to zero, as it is not as trustworthy) and the opponent's rating. If the score is negative and the opponent is higher-rated, I don't adjust the score. If the opponent is lower-rated and was _still_ able to drive Crafty to a negative score, I make it even more negative, as this must really be a bad opening. If the score is positive and the opponent is lower-rated, I drag the score down, since Crafty ought to get + scores against weaker players. If the opponent is higher-rated, I adjust it up a bit for the same reason.

I then backtrack through the book line, averaging in this learn value, until I reach a point (going backward) where the program had a choice. I reduce the learn value by dividing by the number of choices and continue working back toward the root, dividing whenever there are choices. The idea is that if you have 3 choices, then maybe just one is bad. You try it. If it is bad, the learn value will steer us away from that move, to one of the other two. If that one also gets a negative score, we divide by 2 (the first move is already marked as bad, so we only have two choices now). The next time we get here we try the last move and repeat...

That is brief. If you have questions, feel free to ask.
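
For readers who want something concrete, the two mechanical steps described in the post (picking the learn value from the saved scores, then backing it up the book line) might be sketched roughly as follows. This is an illustrative Python sketch, not Crafty's actual code. The names `learn_value`, `BookMove`, and `backtrack` are hypothetical, the "averaging in" is assumed to be a plain mean of the old and new values, and the division is assumed to happen before the value is stored at a choice point. The depth and rating adjustments are left out because the post gives their direction but not their size.

```python
from dataclasses import dataclass


def learn_value(scores):
    """Pick a learn value from the scores of the first moves out of book.

    `scores` are search scores in pawns from the program's point of view.
    Assumption: the sign of the first score decides which rule applies.
    """
    if not scores:
        raise ValueError("need at least one score")
    if scores[0] < 0:
        # Gambit offered: take the high point after the lowest point.
        low = min(range(len(scores)), key=scores.__getitem__)
        return max(scores[low:])
    # Gambit accepted: take the low point after the highest point.
    high = max(range(len(scores)), key=scores.__getitem__)
    return min(scores[high:])


@dataclass
class BookMove:
    """One program move in the book line (hypothetical structure)."""
    name: str
    choices: int        # book alternatives here not already marked bad
    learn: float = 0.0  # learn value stored for this move


def backtrack(line, value):
    """Walk the book line from the last move back toward the root.

    At each position where the program had more than one choice, the
    value is divided by the number of choices before being averaged in.
    Returns the value that reached the root.
    """
    for move in reversed(line):
        if move.choices > 1:
            value /= move.choices
        # Assumption: "averaging in" is a simple mean of old and new.
        move.learn = (move.learn + value) / 2.0
    return value


# Example: a gambit line whose score dips and then recovers.
line = [BookMove("e4", 4), BookMove("Nf3", 1), BookMove("Bc4", 1)]
backtrack(line, learn_value([-1.0, -1.2, -0.8, -0.3, 0.1]))
```

With only one move at a position the value passes back unchanged, so a single bad result counts fully against a forced book line but only fractionally against a move that was one of several alternatives.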
Last modified: Thu, 15 Apr 21 08:11:13 -0700