Author: Ratko V Tomic
Date: 16:03:41 10/16/99
> First of all I don't know of anyone that is claiming programs are 2600+ at
> 40/2. The projections I have heard were 2500+. And since Rebel is very
> near that rating now, I don't see why this is considered so ridiculous.

On SSDF, all top programs are well above 2600 (Fritz 2681, Nimzo99 2678, etc., plus all the top ones from the Pentium 200 list, still untested on the K6-2 450). So that kind of inbred rating is what some might imagine will extend and apply against humans, at longer time controls and over a longer sequence of games against the same opponent(s).

At earlier times SSDF used to have a good deal of human games in the mix, and this kept the against-human ratings within a plausible range. Back then you could watch a commercial program's human rating slide downwards over time. I followed the chess units I owned at any given time and noticed the slide: units I bought as 2200+ went down into the 1900s (the Elo sketch further down puts numbers on that kind of slide). Of course, they slide today too, in comp-comp competition, but that's because the new programs are better or run on faster hardware. Humans, who didn't improve in their human ratings, somehow started getting more points out of any given chess computer model.

> You state that you
> "believe" the rating would drop if the match was repeated seven times.

I don't think I said anything like 7 times (that may have been someone else in another thread). I only said that if they were to play week after week, Rebel's rating would drop against that team. Like with coin flips, any given week's result can go up or down, but the average over several weeks would show the program's downward slide (or, equivalently, the average over the first half of the matches vs. the average over the second half).

> Can you give some statistics or evidence to back this up? Frankly the
> idea that humans somehow are going to improve 100 pts against humans
> over time is somewhat suspect to me.

You meant to say against computers. In any case, I said the computer's rating will drop against a given human team, not that the human rating will appreciably increase (if at all) in games against other humans.

A well motivated human team would, after some practice, learn the general weaknesses of the program: not just the opening lines, which can easily be varied or learned by the program, but its fundamental limitations. Lack of longer term planning. Lack of common sense (e.g. if you know your opponent has prepared well and he offers you a seemingly free pawn at a depth where you know he would see such a loss, you know you should be very cautious, and you might decline it even though you can't see at the board why, especially if you got burned in similar ways a few times before; a program will happily take it every time, as long as its evaluation doesn't show any danger in sight). Predictable greed (you can draw its queen off the kingside with any freebie on the queenside, offered in the most shamelessly transparent way; it goes for the free pawn every time, as long as it doesn't see a drop in its evaluation, even if it has lost a few games before to the same kind of cheap trick). Etc.

You're basically dealing with the equivalent of an idiot on amphetamines: very fast, yes, but still a complete idiot at its core, despite the appearance of intelligence and erudition (due to the vast amounts of memorized openings, making it look as if it's strategizing, and to the surprise tactical shots which are its one genuinely strong side).
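As an aside, the arithmetic behind such a slide follows directly from the standard Elo formulas. Here is a minimal sketch in Python; the specific ratings and score fractions are just illustrative, not measured data:

    import math

    def expected_score(r_player: float, r_opponent: float) -> float:
        """Standard Elo expected score against a single opponent rating."""
        return 1.0 / (1.0 + 10.0 ** ((r_opponent - r_player) / 400.0))

    def performance_rating(r_opponent: float, score: float) -> float:
        """Rating implied by a score fraction against a fixed opponent
        (the logistic inversion of expected_score)."""
        return r_opponent - 400.0 * math.log10((1.0 - score) / score)

    # A unit sold as 2200 is expected to score 50% against a 2200 human:
    print(expected_score(2200, 2200))             # 0.5
    # Once the human learns its weaknesses and holds it to 16%, the same
    # unit is performing in the 1900s, the kind of slide described above:
    print(round(performance_rating(2200, 0.16)))  # ~1912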
> Could you give me a specific example of what you mean? Are you
> saying the human for instance will discover some weakness in how the program
> handles the kings indian, or sicilian and thus exploit that weakness? I guess
> this whole idea is unclear. Since the computers book will navigate it around
> many of these problems I really cannot see what you mean.

Not a specific opening. As you say, that can easily be bypassed. It is the general rigidity of the program's method, and its limits. The program's evaluation function is inexact, and will remain so as long as it can't traverse from a given position to the end (checkmate or draw) against all replies, which is almost always the case, except for the sparsely populated endings covered by tablebases. Once the program reaches its designated depth, resolves captures and checks down to a quieter position, and then adds up all the material and positional evaluation terms, it gets a number which varies between different variations and iterations like a little random noise, perhaps a quarter pawn up or down. There are usually several choices from the root falling within this noise band. The small differences between them are essentially worthless noise, but the program will pick the "best" one.

Now, each of these root choices also has aspects with longer term consequences, in particular how well it fits various long term strategic plans, pursuing goals which may stretch as far as the endgame, with a number of intermediate subgoals to be achieved in stages. A human player will, like the program, have several candidate moves, and, like it, he will check their tactical consequences as far as he can. But he will also check how well each move fits the plans he considers viable in that type of position (and he will actively seek moves which advance such plans). The computer, on the other hand, will pick on this move something which happens to fit one plan and its chain of goals, then on the next move undo that advance and pick something matching some other plan, since the noise behind its evaluation function drives it in random directions within the noise margin.

Over the span of many moves the human's choices add up in one direction, advancing consistently toward the goals of the selected plan (the goals may change, of course, but only if the position demands it), and the little steps accumulate into something useful. The program's choices, decided at this small level of differences inside the evaluation noise band, cancel out any such coherent advance. Like a kid in a candy store, it hops happily from one 0.1-pawn "gain" to the next, without a clue what the "gain" is good for (a stew of numerous pluses and minuses from unrelated strategic bits of "knowledge"), merely waiting for a tactical shot to turn up. Unless it gets that "lucky" tactical shot, or the human's plan turns out to be flawed (unrealistic, poorly decomposed into subgoals, etc.), or the human has no plan at all, the program's position will get worse for reasons fundamentally beyond its search algorithm. That's why programs fall for the long range, slowly unfolding kingside attacks: the attack builds in small increments, with the fruits too far away to be seen in the search, until it is too late to defend against. A toy sketch of this noise-band effect follows.
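This is not real engine code, just a toy model of the point: when several root moves differ by less than the evaluation noise, the "best" move is effectively chosen by the noise, so no plan gets pursued coherently. The move names and numbers are made up:

    import random

    NOISE = 0.25  # evaluation noise band in pawns (the "quarter pawn" above)

    def noisy_eval(true_value: float) -> float:
        """Stand-in for a static evaluation reached after resolving captures
        and checks: it wanders around the true value within the noise band."""
        return true_value + random.uniform(-NOISE, NOISE)

    # Three root moves whose real worth is identical; two advance different,
    # incompatible plans, one advances nothing.
    root_moves = {"advance plan A": 0.1, "advance plan B": 0.1, "aimless": 0.1}

    for move_number in range(1, 7):
        scores = {m: noisy_eval(v) for m, v in root_moves.items()}
        best = max(scores, key=scores.get)  # the program's "best" move
        print(move_number, best, f"{scores[best]:+.2f}")
    # The pick almost always hops between the moves from turn to turn: the
    # score differences driving the choice are pure noise, so the small
    # plan-advancing steps cancel instead of accumulating.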
There is, at present, no theory of anti-computer play, nor systematic guidelines and strategies; only the bits and pieces of advice and intuition people build up if they play a lot against strong programs. Most of us who did and do play against programs notice that over time the same program seems to weaken, and we need a newer, stronger engine or another program. How long it takes depends, obviously, on the initial difference between the program's apparent strength (which is the strength of its strongest aspect) and the player's strength. By the time the program reaches its bottom against you, you've learned to play against its weakest aspect. So the apparent drop in a program's strength against a human depends on the gap between its strongest and its weakest aspect. While the programs may be 2900-strong in 10-14 ply messy tactics, they're maybe 2000-2100 in strategy (it varies somewhat between programs). The real, durable strength of a program (the plateau to which it slides against you over time) isn't some average of all such aspects (say a figure around 2500-2600, which is what comp-comp matches show), but a figure closer to its weakest aspect, maybe around 2300; the sketch below puts rough numbers on this. You won't in practice drive it all the way down to 2100, or wherever its bottom is (unless you're an especially motivated and stable player), since most of us have our own weaknesses and can't consistently handle the pressure of the hidden tactics a program may discover, even when those have become sparser, less hidden and less dangerous thanks to our intuitive "defanging" strategies driving the positions away from its strong arm.
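A rough back-of-the-envelope version of the strongest-vs-weakest-aspect argument: model the program as playing at its tactics rating in some fraction of the game and at its strategy rating in the rest, with the human learning to steer more of the game into the weak aspect. The aspect ratings (2900 tactics, 2100 strategy), the 2400 human, the steering fractions, and the blending model itself are all just my illustration of the figures above:

    import math

    def expected_score(r_a: float, r_b: float) -> float:
        """Standard Elo expected score for a player rated r_a vs r_b."""
        return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

    TACTICS, STRATEGY = 2900.0, 2100.0  # strongest and weakest aspects
    HUMAN = 2400.0                      # an illustrative opponent

    for steered in (0.0, 0.5, 0.8):
        # Human's blended score as steering improves with practice:
        s = (steered * expected_score(HUMAN, STRATEGY)
             + (1 - steered) * expected_score(HUMAN, TACTICS))
        # Invert back to the program's effective rating vs this human:
        effective = HUMAN - 400.0 * math.log10(s / (1.0 - s))
        print(f"steered {steered:.0%}: program plays like {effective:.0f}")
    # Output runs roughly 2900 -> 2430 -> 2260: the plateau lands far
    # closer to the weakest aspect than to any comp-comp average.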