Computer Chess Club Archives



Subject: Re: Update on Rebel -Lithuania Re-match?

Author: Ratko V Tomic

Date: 16:03:41 10/16/99


> First of all I don't know of anyone that is claiming programs are 2600+ at
> 40/2. The projections I have heard were 2500+. And since Rebel is very
> near that rating now, I don't see why this is considered so ridiculous.

On SSDF, all top programs are well above 2600 (Fritz 2681, Nimzo99 2678, etc.,
plus all the top ones from the Pentium 200 list, though untested on the K6-2 450). So that
kind of inbred rating is what some might imagine will extend and apply against
humans at longer time controls and over a longer sequence of games against the same
opponent(s). In earlier times SSDF used to have a good deal of human games in
the mix, and this kept the against-human ratings within a plausible range. Back then you
could watch a commercial program's human rating slide downwards over time (I
followed the chess units that I had at any given time and noticed the slide;
chess units which I bought as 2200+ went down into the 1900s). Of course, they
slide today, too, in comp-comp competition, but that's because the new programs
are better or run on faster hardware. Humans, who didn't improve in their human
ratings, somehow started getting more points out of any given chess computer
model.


> You state that you
> "believe" the rating would drop if the match was repeated seven times.

I don't think I said anything like 7 times (that might have been someone else in
another thread). I only said that if they were to play week after week, Rebel's
rating would drop against that team, although, like with coin flips, any given week's
result can go up or down; the average over several weeks would show the
program's downward slide (or, equivalently, the average over the first half of the
matches vs. the average over the second half).
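
To make the coin-flip point concrete, here is a small sketch (my own illustration,
not anything from the actual match): assume, purely hypothetically, that the
program starts scoring 60% and loses two percentage points per week as the team
learns its weaknesses. Any single week's result still bounces around, but
comparing the first-half average with the second-half average exposes the slide.

import random

random.seed(1)
weeks = 10
games_per_week = 8

def weekly_score(week):
    # Hypothetical model: 60% starting score, minus 2 points per week
    p_win = 0.60 - 0.02 * week
    return sum(random.random() < p_win for _ in range(games_per_week))

scores = [weekly_score(w) for w in range(weeks)]
half = weeks // 2
first_half = sum(scores[:half]) / (half * games_per_week)
second_half = sum(scores[half:]) / (half * games_per_week)
print(scores)                   # individual weeks bounce up and down
print(first_half, second_half)  # the half-averages show the slide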

> Can you give some statistics or evidence to back this up? Frankly the
> idea that humans somehow are going to improve 100 pts against humans
> over time is somewhat suspect to me.

You meant to say against computers. In any case, I said the computer's rating will
drop against a given human team, not that human ratings will appreciably increase
(if at all) in games against other humans. A well motivated human team would,
after some practice, learn the general weaknesses of the program: not just the
opening lines, which can easily be varied or learned by the program, but its
fundamental limitations, like the lack of longer term planning and the lack of common sense
(e.g., if you know that I have prepared well, and I offer you a seemingly free
pawn at a depth where you know I would see such a loss, you know you should be very
cautious, and you might not take it even though you can't see at the board why not,
especially if you got burned in similar ways a few times before; a program will
happily take it every time, as long as its evaluation doesn't show any danger in
sight), and its predictable greed (you can draw its queen off the king's side with any
freebie on the queenside, offered in the most shamelessly transparent way; it goes
for a free pawn every time, as long as it doesn't see a drop in its
evaluation, even though it may have lost a few games before to the same kind of
cheap trick), etc.

You're basically dealing with an equivalent of an idiot on amphetamines: very
fast, yes, but still a complete idiot at its core, despite the appearance of
intelligence and erudition (due to the vast amounts of memorized openings,
which make it look as if it's strategizing, or to surprise tactical shots, which
are its genuinely strong side).


> Could you give me a specific example of what you mean? Are you
> saying the human for instance will discover some weakness in how the program
> handles the kings indian, or sicilian and thus exploit that weakness? I guess
> this whole idea is unclear. Since the computer's book will navigate it around
> many of these problems I really cannot see what you mean.

Not a specific opening. As you say, that can easily be bypassed. But it is the
general rigidity of its method and its limits.

The program's evaluation function is inexact, and this will be so as long as programs
can't search from any given position to the end (checkmate or draw) along
every line of play (which is almost always the case, except for the sparsely populated
endings for which they have tablebases). Once the program reaches its designated
depth, and resolves captures and checks to a quieter position, and then adds up
all the material and positional evaluation terms, it gets a number which
varies between different variations and iterations, like a little random noise,
perhaps a quarter of a pawn up or down. There are usually several choices from the
root falling within this noise band. The small differences between them are
essentially worthless noise, but the program will pick the "best" one.
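
A minimal sketch of that last point (my own illustration, with made-up moves and
scores, not any engine's actual code): several root moves whose scores differ by
less than the noise margin are effectively indistinguishable, yet the engine
still commits to the numerically "best" one.

NOISE_BAND = 0.25  # roughly a quarter pawn, as estimated above

# Hypothetical root scores, in pawns, after search + quiescence + evaluation
root_scores = {"Nf3": 0.31, "c4": 0.27, "g3": 0.22, "h4": -0.40}

best_move = max(root_scores, key=root_scores.get)
best_score = root_scores[best_move]

# All moves within the noise band of the best are statistically equivalent
equivalent = [m for m, s in root_scores.items() if best_score - s <= NOISE_BAND]

print(best_move)    # "Nf3" -- chosen on a meaningless 0.04-pawn edge
print(equivalent)   # ["Nf3", "c4", "g3"] -- the real set of candidates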

Now, each of these choices (from the root) also has aspects with longer
term consequences, in particular how well they fit into various long term
strategic plans, pursuing goals which may stretch as far as the endgame,
with a number of intermediate subgoals to be achieved in stages. A human
player will, like the program, have several choices and, like it, he will
check, as much as he can, the tactical consequences. But he will also check how
well such moves fit the various plans he considers viable in that type of position.
(He will also actively seek a move which advances such plans.)

The computer, on the other hand, will on the current move pick something which may
fit one plan and its chain of goals, and on the next move it will undo that advance
and pick something else matching some other plan, since the noise behind its
evaluation function drives it in random directions within that noise margin.
Over the span of many moves, the human's choices will add up in one direction,
advancing consistently toward the goals of the selected
plan (the goals may change, of course, but only if the position demands it), and overall
the little steps will add up to something useful. The program's choices, based on
differences this small (within the evaluation noise band), will cancel out
any such coherent or consistent advance in small steps.
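
A toy illustration of the cancellation (mine, under the assumptions above):
model progress toward a plan as a single number, give the human a small step in
the same direction every move, and let the program step in a random direction
every move because a different plan "wins" within the noise.

import random

random.seed(2)
moves = 30
step = 0.1  # each move contributes ~0.1 pawn of progress toward some plan

human_progress = sum(step for _ in range(moves))
program_progress = sum(random.choice([step, -step]) for _ in range(moves))

print(human_progress)    # ~3.0 -- small steps add up in one direction
print(program_progress)  # typically much closer to 0 -- the steps cancel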

Like a kid in a candy store, it will hop happily from one 0.1 pawn equivalent
"gain" to the next one, without a clue what to do with the "gain" by itself
(what is it good for? a stew of numerous pluses and minuses of unrelated
strategic bits of "knowledge"), merely waiting for a tactical shot to turn up
eventually. Unless it gets this "lucky" tactical shot, or the human plan
turns out to be flawed (unrealistic, poorly decomposed into subgoals, etc., or
the human has no plan at all), its position will get worse for reasons that are
fundamentally beyond its search algorithms. That's why they fall for the long
range, slowly unfolding, king-side attacks: the attack builds in small increments,
with fruits too far away to be seen in the search (until it is too late to
defend against them).


There is, at present, no theory of anti-computer play, nor systematic
guidelines and strategies; only the bits and pieces of advice and intuition
people build up if they play a lot against strong programs. Most of us who did
and do play against programs notice that, over time, the same program seems to
weaken, and we need a newer and stronger engine, or another program. How long it
takes depends, obviously, on the initial difference between the program's apparent
strength (which is the strength of its strongest aspect) and the player's
strength. By the time the program reaches its bottom against you, you've learned to
play against its weakest aspect. So the apparent drop in the program's strength
against humans depends on the difference between its strongest and its weakest
aspect. While they may be 2900 strong in messy 10-14 ply tactics, they're
maybe 2000-2100 in strategy (it varies somewhat between programs).
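
To put numbers on that (my own arithmetic, using the standard Elo expected-score
formula, and taking the 2900/2100 aspect figures above as assumptions): a 2400
player is nearly helpless against the tactical aspect but heavily favored
against the strategic one, so which aspect the games get steered into dominates
the result.

def expected_score(r_player, r_opponent):
    # Standard Elo expectation: the player's expected score per game
    return 1.0 / (1.0 + 10.0 ** ((r_opponent - r_player) / 400.0))

human = 2400
print(expected_score(human, 2900))  # ~0.05 if dragged into messy tactics
print(expected_score(human, 2100))  # ~0.85 if the game stays strategic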

The real, durable strength of a program (the plateau to which it will slide
against you over time) isn't some average of all such aspects, say a figure
around 2500-2600, which is what comp-comp matches are showing, but a
figure closer to its weakest aspect, maybe around 2300. You won't in practice
(unless you're an especially motivated and stable player) drive them all the way
down to 2100, or whatever their bottom is, since most of us have our weaknesses
and can't consistently handle the pressure of hidden tactics a program may
discover, even when those have become sparser, less hidden and less dangerous
thanks to our intuitive "defanging" strategies driving the positions away from its
strong arm.



