Computer Chess Club Archives


Subject: Re: Let's back off for a minute from Rc6

Author: Ratko V Tomic

Date: 08:40:42 10/22/00


> So you are proposing the program play the opponent as opposed
> to the position.

The optimum bet size suggestion is a generalization of the already
common practice of setting a program's contempt level according
to the strength of the opponent. If a program is playing a much stronger
opponent, it may take a draw offer even though it evaluates the
position at +1 in its favor. The program's operators know that if it
continues playing, it will be gambling on each future move, since it
cannot evaluate perfectly. Knowing that it will be making errors, and
knowing that the opponent will likely be making fewer or lesser errors,
it is probably wiser to take the half point than to bet that it can
hold on and magnify the current gain. Similarly, when playing
against a weaker opponent, one may refuse a draw offer even if
one evaluates the position at -1, i.e. a pawn down.
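
To make the contempt rule concrete, here is a minimal sketch in Python.
The sign convention (positive contempt against a weaker opponent, a draw
scored as minus the contempt) and the function names are assumptions
chosen for illustration, not taken from any particular program.

```python
def draw_score(contempt: float) -> float:
    """Value of a draw from our side, in pawns (assumed convention).

    Positive contempt (weaker opponent) makes a draw look bad;
    negative contempt (stronger opponent) makes it look good.
    """
    return -contempt


def accept_draw(evaluation: float, contempt: float) -> bool:
    """Accept a draw offer when the position is worth no more than a draw."""
    return evaluation <= draw_score(contempt)


# Much stronger opponent: contempt -1.5, so +1.0 is still worth less than a draw.
print(accept_draw(1.0, -1.5))
# Much weaker opponent: contempt +1.5, so even -1.0 is worth playing on.
print(accept_draw(-1.0, 1.5))
```

With zero contempt the rule reduces to the usual one: take the draw only
when the evaluation is not positive.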


It is clear that the contempt setting is a special case of the bet
size adjustment method described. The edge dependent bet size
adjustment is sensitive to the relative difference in strength in
certain types of positions, while the contempt level is sensitive
only to the overall relative strength difference (or even merely to
the plain rating, which isn't even sensitive to the opponent
specific strength difference). Hence, contempt level tuning is a
blunter form of the bet adjustment method, one that is blind to the
specific position type. The bet adjustment method takes into account
not only the strengths relative to a specific opponent but also the
strengths in certain types of positions.

Now, one should recognize that even though programs do make
evaluation errors, so that on every move there is a gambling component
in the decision process, this gambling isn't quite the same as coin
tossing. Coin tossing is a memoryless process -- the
outcome of the next try doesn't depend on the outcome of the
previous try. That's not true in chess at all, where generally,
much like in real life, if you're rich you'll be given more, and
if you're poor it will be taken away from you. Chess has a positive
feedback loop, where a small gain increases one's relative fire-power,
as it were, which in turn increases the odds of making another gain,
which in turn increases relative fire-power, and so on -- success
feeds on itself, producing more success. Therefore the formula for
the optimum bet size, C*(2P-1), will not have a constant P. Instead P
will depend in some way on the position, and especially on the material
balance, and these will change from move to move.
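
A small sketch of the formula C*(2P-1) with a P that moves with the
material balance. The logistic shape of win_probability and its feedback
parameter are purely hypothetical, picked only to show the positive
feedback; clamping the bet at zero when P < 0.5 is also my assumption.

```python
import math


def optimum_bet(capital: float, p: float) -> float:
    """Optimum bet C*(2P-1) for an even-money wager; no bet without an edge."""
    return max(0.0, capital * (2.0 * p - 1.0))


def win_probability(base_p: float, material_edge: float, feedback: float = 0.3) -> float:
    """Hypothetical model: P grows with the material edge (in pawns).

    base_p is the win probability at material equality; feedback controls
    how strongly a material gain raises the odds of further gains.
    """
    logit = math.log(base_p / (1.0 - base_p)) + feedback * material_edge
    return 1.0 / (1.0 + math.exp(-logit))


# The same capital justifies a larger bet once a pawn has been won.
for edge in (0.0, 1.0, 2.0):
    p = win_probability(0.55, edge)
    print(edge, round(p, 3), round(optimum_bet(100.0, p), 1))
```

The point is only qualitative: since P is not constant, the bet has to be
recomputed as the position changes.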

But, as commented earlier, the program (or programmer) doesn't know what
P is anyway, so the formula C*(2P-1) wouldn't be used directly in
any case. Only the general relation between the bet size and C, or the
relative strengths in a given type of position (related to P), needs to
be accounted for.

For example, the more material on the board, the greater the optimum bet
should be. Or the greater the capability edge the program estimates it
has against its current opponent in a given type of position, the
greater the optimum bet should be. Just as the programmer may set the
contempt level before a game against a known opponent, one can
set the program's position type specific risk-aversion parameters
depending on how much edge one judges it to have (or to lack)
against the given opponent in those position types.
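
As a rough illustration of such pre-game settings, the sketch below keeps
one edge estimate per position type and scales a maximum bet by it and by
the material left on the board. The class, the position type labels, the
units, and the multiplicative form are all hypothetical; only the
direction of each dependence comes from the argument above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RiskProfile:
    """Per-opponent risk settings, chosen before the game (hypothetical)."""
    base_bet: float = 0.5  # max bet in pawns at full material and zero edge
    edge_by_type: Dict[str, float] = field(default_factory=dict)  # -1.0 .. +1.0

    def max_bet(self, position_type: str, material_fraction: float) -> float:
        """Larger with more material on the board and with a larger judged edge."""
        edge = self.edge_by_type.get(position_type, 0.0)
        return self.base_bet * material_fraction * max(0.0, 1.0 + edge)


# Judged stronger than this opponent in closed positions, weaker in endgames.
profile = RiskProfile(base_bet=0.5, edge_by_type={"closed": 0.5, "endgame": -0.5})
print(profile.max_bet("closed", 1.0))
print(profile.max_bet("endgame", 0.4))
```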

It is the same kind of adjustment. Of course, one would tune the
optimum contempt threshold depending, say, on the rating difference
between the programs. For example, one could use stats on how many
points one scored on average across a given rating gap from positions
with the same + evaluation one has now. Say the past average was 0.4
points per game from a +1.0 evaluation against opponents 200+/-Delta
points stronger: since a draw is worth 0.5, one should offer or accept
a draw (in a tournament other aspects may weigh in as well).
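
The statistical rule above could look something like this. The record
layout (rating gap, evaluation, final score) and the tolerance parameters
are assumptions for the sake of the sketch; how one actually buckets past
games is a tuning question.

```python
from typing import Iterable, Optional, Tuple

# (opponent rating minus ours, our evaluation in pawns, final score 0 / 0.5 / 1)
Record = Tuple[float, float, float]


def average_score(records: Iterable[Record], gap: float, gap_delta: float,
                  evaluation: float, eval_delta: float) -> Optional[float]:
    """Mean score in past games with a similar rating gap and evaluation."""
    scores = [s for g, e, s in records
              if abs(g - gap) <= gap_delta and abs(e - evaluation) <= eval_delta]
    return sum(scores) / len(scores) if scores else None


def should_take_draw(records: Iterable[Record], gap: float, gap_delta: float,
                     evaluation: float, eval_delta: float) -> bool:
    """Take the half point when comparable past games averaged less than 0.5."""
    avg = average_score(records, gap, gap_delta, evaluation, eval_delta)
    return avg is not None and avg < 0.5
```

If no comparable games exist the sketch declines to decide, leaving the
choice to the ordinary contempt setting.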

Similarly, one could tune all the other optimum bet size computations.
The programmer doesn't know what the P's are, or even what the precise
formula is, but he does know which way to adjust the program's
risk-aversion parameters based on the perceived edge against a given
opponent in some aspect of the game and on the given 'net worth'. What
gets tuned is _how much_ exactly to add to or subtract from the risk
preferences, which in turn defines the max bets the program will compute
and use (when the presence of an edge is detected), but not whether to
add to or subtract from the risk preferences.

In a longer match against a fixed opponent, such as the SSDF matchups,
the program may keep a running score against the given opponent and
use it to adjust the relative strengths (the pair specific rating
difference), which in turn are used in the automated contempt level
adjustment. This same running score could also be used to automatically
adjust the program's more general risk-aversion parameters, such as
those based on position specific edge.
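
A minimal sketch of the running-score idea, using the standard Elo
expected-score curve to turn a match score into a pair specific rating
difference. The prior (a few "virtual games" at the published rating
difference, to avoid wild swings early in the match) and the linear
mapping from rating difference to contempt are my assumptions, not
anything the programs are known to do.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Elo expected score for a player rated rating_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def pair_rating_diff(points: float, games: int,
                     prior_diff: float = 0.0, prior_games: float = 4.0) -> float:
    """Rating difference implied by the running score, smoothed by a prior."""
    prior_points = expected_score(prior_diff) * prior_games
    s = (points + prior_points) / (games + prior_games)
    return 400.0 * math.log10(s / (1.0 - s))


def contempt_from_diff(rating_diff: float, pawns_per_100: float = 0.25,
                       limit: float = 2.0) -> float:
    """Hypothetical linear mapping from rating difference to contempt in pawns."""
    return max(-limit, min(limit, rating_diff / 100.0 * pawns_per_100))


# After scoring 8 points in 10 games the contempt is raised; after 2 it is lowered.
print(contempt_from_diff(pair_rating_diff(8.0, 10)))
print(contempt_from_diff(pair_rating_diff(2.0, 10)))
```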






