Author: Ratko V Tomic
Date: 08:40:42 10/22/00
> So you are proposing the program play the opponent as opposed
> to the position.

The optimum bet size suggestion is a generalization of the already common practice of setting a program's contempt level according to the strength of the opponent. If a program is playing a much stronger opponent, it may take a draw offer even though it evaluates the position as +1 in its favor. The program's operators know that if it continues playing, it will be gambling on each future move, since it cannot evaluate perfectly. Knowing that it will make errors, and that the opponent will likely make fewer or smaller ones, it is probably wiser to take the half point than to bet that it can hold on and magnify the current gain. Similarly, when playing against a weaker opponent, one may refuse a draw offer even if one's own evaluation is -1.

It is clear that the contempt setting is a special case of the bet size adjustment method described. The edge-dependent bet size adjustment is sensitive to the relative difference in strength for certain types of positions, while the contempt level is sensitive only to the overall relative strength difference (or even merely to the plain rating, which isn't sensitive to any opponent-specific strength difference at all). Hence, contempt level tuning is a blunter form of the bet adjustment method, one that is blind to the specific position type, whereas the bet adjustment method takes into account not only the strength relative to a specific opponent but also the strength in certain types of positions.

Now, one should recognize that even though programs do make evaluation errors, so that every move has a gambling component in the decision process, this gambling isn't quite the same as coin tossing. Coin tossing is a memoryless process -- the outcome of the next try doesn't depend on the outcome of the previous try.
That's not true in chess at all, where generally, much like in real life, if you're rich you'll be given more, and if you're poor it will be taken away from you. Chess has a positive feedback loop: a small gain increases one's relative fire-power, as it were, which in turn increases the odds of making another gain, which in turn increases relative fire-power, and so on -- success feeds on itself, producing more success.

Therefore the formula for the optimum bet size, C*(2P-1), will not have a constant P. P will depend in some way on the position, and especially on the material balance, and these change from move to move. But, as commented earlier, the program (or programmer) doesn't know what P is anyway, so the formula C*(2P-1) wouldn't be used directly in any case. Only the general relation between the bet size and C, or the relative strengths in the given type of position (related to P), needs to be accounted for. For example, the more material on the board, the greater the optimum bet should be. Or, the greater the capability edge the program estimates it has against its current opponent in a given type of position, the greater the optimum bet should be.

Just as the programmer may set the contempt level before a game against a known opponent, one can set the program's position-type-specific risk-aversion parameters depending on how much edge one judges it to have (or to lack) against the given opponent in those position types. It is the same kind of adjustment.

Of course, one would tune the optimum contempt threshold depending, say, on the rating difference between the programs. For example, one could use statistics on how many points per game one scored on average against a given rating gap, given the + evaluation one has now. Say the past average was 0.4 points per game from a +1.0 evaluation against opponents 200+/-Delta points stronger: since that is below the 0.5 a draw is worth, one should offer or accept a draw (in a tournament other aspects may weigh in as well).
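
A rough sketch of the two quantities mentioned above may make them more concrete. It assumes that C is the current capital (or 'net worth') and P the probability of winning an even-money bet, which is how C*(2P-1) reads but is not spelled out in this post. The function names and the way past results are collected are made up for illustration; no actual program is claimed to work this way.

```python
from typing import Sequence


def optimum_bet(capital: float, p_win: float) -> float:
    """Bet size C*(2P-1) for an even-money bet.

    Assumption: C is the current capital and P the win probability.
    Returns 0 when there is no edge (P <= 0.5), i.e. don't bet.
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    return max(0.0, capital * (2.0 * p_win - 1.0))


def should_take_draw(past_points: Sequence[float]) -> bool:
    """Decide on a draw from past results in comparable situations.

    past_points holds the game results (1, 0.5 or 0) from earlier games
    with a similar evaluation against a similar rating gap. A draw is
    worth 0.5, so take it when the past average is lower than that.
    With no data at all, this sketch simply plays on.
    """
    if not past_points:
        return False
    average = sum(past_points) / len(past_points)
    return average < 0.5


if __name__ == "__main__":
    # With a 60% chance of winning, bet about 20% of the capital.
    print(optimum_bet(100.0, 0.6))
    # Past average of 0.4 points per game from this kind of position.
    print(should_take_draw([0.5, 0.5, 0.0, 0.5, 0.5]))
```

In practice P is unknown, as noted above, so only the second function resembles something a program could use directly; the first is there to show which way the bet moves as the edge grows.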
Similarly, one could tune all the other optimum bet size computations. The programmer doesn't know what the P's are, or even what the precise formula is, but he does know which way to adjust the program's risk-aversion parameters, based on the perceived edge against the given opponent in some aspect of the game and on the given 'net worth'. What gets tuned is _how much_ exactly to add to or subtract from the risk preferences, which in turn define the maximum bets the program will compute and use (when the presence of an edge is detected), but not whether to add or subtract.

In a longer match against a fixed opponent, such as the SSDF matchups, the program may keep a running score against the given opponent and use it to adjust the relative strengths (the pair-specific rating difference), which in turn are used in the automated contempt level adjustment. This same running score could also be used to automatically adjust the program's more general risk-aversion parameters, such as those based on position-specific edge.
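
As an illustration of the running-score idea, here is a minimal sketch that turns a match score into a pair-specific rating difference using the standard Elo expected-score curve, and then into a contempt value. The Elo conversion is the usual logistic formula; everything else (the function names, the clamping of extreme scores, the 25 centipawns per 100 Elo scale and the 100 centipawn cap) is an assumption made up for this example and would need tuning.

```python
import math


def elo_expected(gap: float) -> float:
    """Expected score for a player rated `gap` Elo points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


def gap_from_score(points: float, games: int) -> float:
    """Pair-specific rating gap implied by the running match score.

    Inverse of elo_expected(). The score fraction is clamped to
    [0.01, 0.99] (an arbitrary choice) so that a 0% or 100% score
    does not produce an infinite gap.
    """
    if games <= 0:
        return 0.0  # no games yet: assume equal strength
    s = min(max(points / games, 0.01), 0.99)
    return 400.0 * math.log10(s / (1.0 - s))


def contempt_cp(gap: float, cp_per_100_elo: float = 25.0,
                limit: float = 100.0) -> float:
    """Map a rating gap to a contempt value in centipawns.

    Positive means: avoid draws (we are stronger). The linear scale
    and the cap are hypothetical tuning parameters.
    """
    value = gap / 100.0 * cp_per_100_elo
    return max(-limit, min(limit, value))


if __name__ == "__main__":
    # After 10 games with 7.5 points, update the contempt setting.
    gap = gap_from_score(7.5, 10)
    print(round(gap), contempt_cp(gap))
```

The same estimated gap could feed any other risk-aversion parameter; keeping separate running scores per position type would be one way to get the position-specific edge, at the cost of far fewer games per estimate.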