Author: Rafael Vasquez
Date: 15:09:31 03/20/98
On March 20, 1998 at 17:11:15, Amir Ban wrote:

Amir, I really liked this message. The more the programmer controls his program's behavior, the more it becomes possible for him to implement your ideas. I guess those with more knowledge will profit more from this.

>"Learner" is a big word for what we have today. Whatever benefit lies in
>today's learners depends on a simple fact: Programs play more or less in
>a deterministic manner when they leave book, especially if the time
>controls are fixed.
>
>A learner avoids losing the same game twice by remembering the loss and
>varying a move. The move it varies is not the losing move, and probably
>not even a bad move in the ordinary sense of the word. It's just that
>this line leads the program to play consistently against some specific
>line to a loss. Similarly, a learner can repeat a win against a
>non-learner by relying on it to repeat a losing line.

You have just described the Genius-Rebel learning feature.

>The same idea is behind the trick of adding autoplayed games to a book,
>as is rumoured to have been done by MCPro 7.1 and used very effectively
>against some including Hiarcs.

MC-PRO automatically adds moves (up to 10) to its book. Which others do this?

>The victim, being deterministic, was
>known beforehand to follow a losing line, and so it did. To be
>effective, the autoplayed games should have been played on the same
>machine with the same time controls as would be used in actual
>competition (SSDF, in this case). I don't think that when this debate
>took place anyone noticed that the autoplayed games were probably also
>played with permanent brain OFF. They had to, because otherwise the
>timing would not have been right, and the exact line may not be
>repeated.
>
>The general consensus here is that against a learner, you need another
>learner. I don't think so at all. A learner would be an overshoot. This
>can be handled by simpler means, that in addition need no long-term
>memory as a learner does.
>The solution is, in one word: VARY.

Humans do this as a normal, automatic procedure.

>Introduce variation through randomization. Don't play the same game
>twice, or at least make this unlikely. Do this especially when you have
>recently left book. Every programmer can probably think of half a dozen
>methods to do this, but I'll anyway try to give some advice:
>
>Adding a small random value to the evaluation is one obvious way to go,
>and may work. Drawbacks are that this may be somewhat expensive, and
>that there may be unwanted complications if the same position doesn't
>evaluate consistently in a single search-tree. An alternative, that I
>would recommend, is to randomize some eval coefficients before starting
>each move, and use those values consistently in the search-tree. Some
>coefficients have only a small effect on the best move chosen, and some
>are too important to be varied, but my guess is that the terms
>controlling mobility, development and centre control, for example, may
>be slightly varied with hardly any effect on strength, and are almost
>guaranteed to produce a different move at least once in say 6 moves,
>especially in the sort of positions that you have out of book. Basically
>that's all you need.
>
>Another reasonable possibility is to sometimes play the second-best move,
>but only if it's almost as good as the best move. This is guaranteed to
>produce a variation, though slightly expensive to compute.
>
>This is very easy to test. Play automatic games from several fixed
>positions and see how many duplicates you get. Use fixed ply depths
>(timing is another source of random variation that you don't want to
>measure here). You don't need to avoid duplicates completely, just get
>them down to a level where you believe a learner would be wasting its
>efforts.
>
>Remember that the main source for variation is the book.
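The two randomization ideas quoted above (per-move coefficient jitter, and occasionally playing a close second-best move) could be sketched roughly like this. All names, weights and margins here are illustrative assumptions, not taken from any actual engine:

```python
import random

# Hypothetical baseline evaluation weights; a real engine has many more terms.
BASE_WEIGHTS = {"mobility": 10.0, "development": 8.0, "center_control": 6.0}

def randomize_weights(base, jitter=0.05, rng=random):
    """Copy the eval weights, nudging each by up to +/- jitter (5% here).

    Done once before each root search, so every node in that search tree
    sees the same, slightly perturbed evaluation -- avoiding the
    inconsistency of adding fresh noise to each position.
    """
    return {term: w * (1.0 + rng.uniform(-jitter, jitter))
            for term, w in base.items()}

def pick_root_move(scored_moves, margin=5, rng=random):
    """Sometimes play the second-best move, but only when it is within
    `margin` (centipawns, say) of the best. `scored_moves` is a list of
    (move, score) pairs sorted best-first."""
    if (len(scored_moves) > 1
            and scored_moves[0][1] - scored_moves[1][1] <= margin
            and rng.random() < 0.5):
        return scored_moves[1][0]
    return scored_moves[0][0]
```

The point of randomizing once per root search, rather than per evaluation, is exactly the drawback Amir mentions: the same position must evaluate consistently inside a single search tree.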
>If you have a
>large book, you are much safer than with a small book (judging by the
>MCPro success against Hiarcs, Hiarcs has a small tournament book). There
>is still danger that someone will take you out of book early, and make
>you follow a losing line again and again, so I propose to measure how
>safe you are this way:

The Junior book supplied with the Junior engine is rather small, but somehow your program is efficient with it. Maybe by making it bigger you could gain some points.

>When you are following book, keep track of the probability of reaching
>the current position. When you are out of book, look at the probability.
>If it's below some threshold, you are fine and do nothing. If you are
>above the threshold, take evasive action through randomization. You
>should have a rough idea of how effective your evasive action is, and
>this allows you to adjust the probability by some factor with each move.
>When you are below your threshold, relax and resume your normal mode.

Can we say that FRITZ is using the factors in its tree as probability factors?

>This procedure is "nice". It is purely defensive. If you feel like being
>aggressive, go ahead and implement a learner. But learners are currently
>not "nice", as some people here have pointed out. They don't really
>learn anything serious about anything, but just try to exploit an
>incidental feature of today's programs (determinism).
>
>Of course, if you meet a learner that really understands something,
>watch out! If someone can figure out that you play endgames weakly, or
>that you are better at attacking the king than at anything else, or that
>your search is ineffective in some situations, and furthermore knows
>what to do with such information, you are still in trouble, but for
>computers that's a long way down the road.

If a program can store in which phase of the game it won, and can determine from the variation in the evaluation results whether it was a tactical shot or not, then maybe the road is not so long.
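Amir's book-probability bookkeeping quoted above could be sketched as follows. The threshold and the evasion factor are assumed tuning constants, not values from any real program:

```python
# Sketch of the "evasive action" bookkeeping described above. While in
# book, accumulate the probability of the line being followed; once out
# of book, compare it against a threshold and discount it each time a
# randomized (evasive) move is played.

THRESHOLD = 0.02      # assumed: below this, a repeat of the line is unlikely
EVASION_FACTOR = 0.5  # assumed: rough effectiveness of one randomized move

def update_after_book_move(line_probability, move_probability):
    """Multiply in the probability of the book move just played."""
    return line_probability * move_probability

def out_of_book_step(line_probability):
    """Out of book: decide whether to randomize, then adjust.

    Returns (take_evasive_action, adjusted_probability). Each evasive
    move makes an exact repeat of this game less likely, so the tracked
    probability is scaled down accordingly.
    """
    evade = line_probability > THRESHOLD
    if evade:
        line_probability *= EVASION_FACTOR
    return evade, line_probability

# Example: a four-move book line where each move had probability 0.5.
p = 1.0
for move_prob in [0.5, 0.5, 0.5, 0.5]:
    p = update_after_book_move(p, move_prob)
# p is 0.0625, above THRESHOLD, so the first out-of-book moves randomize:
evade, p = out_of_book_step(p)
```

Once the discounted probability falls below the threshold, the program relaxes and resumes its normal, deterministic mode, exactly as the quoted procedure describes.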
>Amir

Rafael Vasquez