Author: Stephen A. Boak
Date: 03:57:38 07/29/00
On July 29, 2000 at 06:02:41, blass uri wrote:

>I understand the ELO system.

No you don't--based on what I read in this posting of yours. And I am not trying to judge you harshly because your English may not be perfect.

>The elo system does not use all the information to get the best estimate for the
>elo.

Yes it does, emphatically, and scientifically (meaning objectively, without subjectivity; meaning unbiased).

>It is using only results and not the games.

Right, since an ELO rating is all about predicting the relative results of a player versus other rated players. I said 'results', meaning the % of available points scored by the player--i.e. (Wins + 0.5 * Draws)/(Total Games Rated). What good is a rating system if it doesn't predict *results*?

An ELO rating is designed (that is its mathematical nature, its intended function) to predict results *of a series of games* against opponents of known ELO rating--not how well a player (or computer) will do in a particular position, whether opening, middlegame, or endgame; whether closed or open; whether with kings castled on the same side, castled on opposite sides, or with one or both kings uncastled.

This bears repeating--an ELO rating does not even attempt to determine how well a rated player (even a computer) will do in various phases of the game, or in various types of positions. All those details are subsumed in the strength assigned to the player based on the bottom line--results: Win, Draw or Loss--against rated opponents. Against multiple rated opponents, not just one opponent, since an ELO rating is a relative measure of one player's strength versus many other players in the same rating pool--versus the average strength of the pool on the whole.

Analyzing in detail the playing styles and skills, contrasting two different players (human or computer), assigning personal rating numbers to programs based on their hodge-podge (mix) of skills and abilities across many positions, is all well and good.
But that is not what the ELO system is designed to do. That is a personal system one may devise (for perfectly valid reasons)--but it is not *the ELO system* of Dr. Arpad Elo.

>I am sure that it is possible to do a calculating rating program that will give
>better estimate for the rating by not only counting the results but also by
>analyzing the games and evaluation of programs.

The ELO system is not designed to predict perfectly a single game, or even the results of playing a single opponent (for example, A versus B only). It is designed to predict the %score expected when A plays B, C, D, E, F, G, etc., with known ELO ratings and a known ELO rating average. Because the ELO system presupposes natural variability, it doesn't guarantee any particular score against any particular individual (nor against any particular field of opponents).

The ELO system doesn't only predict results. It also handles the adjustment of a player's rating according to recent *results*. It adjusts an ELO rating up when the % of points scored is higher than that predicted by the relative ELO ratings of the player and each of his opponents. It adjusts an ELO rating down when the % of points scored is lower than predicted by the relative ELO ratings of the player and each of his opponents.

>It is not simple to do this program and I am not going to do it but it is
>possible.
>
>Here is one example when you can learn from analyzing games things that you
>cannot learn from watching results without games:

I agree you can learn things from watching the details (move choices) of a game--about both players.

>Suppose you see that in one game program A outsearched program B and got
>advantage by the evaluation of both programs.
>
>The evaluation of both programs was wrong and program A lost because the
>position that both programs evaluated as clear advantage for A was really a
>losing position for A.
>
>If you analyze the game you can understand it and increase A's rating based on
>this game.
Uri, if the positional skills of program B normally outweigh the increased search capability of program A, then it is possible that program B is stronger than program A. By stronger, I mean that B will achieve better results than A in a head-to-head competition.

Perhaps A outsearches B only on rare occasions (even if in several observed games in a row). Or A outsearches B (as in your own example) and still doesn't win the game. How can you conclude A is better (based on deeper search) when the *results* of that search didn't obtain a victory?

Only by taking several games (a reasonable number, a sufficient number--whatever that figure is for mathematics, statistics, and common sense) and using the *results*, i.e. the relative score of each participant, can you relatively rate the two programs. Since we use ELO-type ratings to compare A versus B, C, D, E, etc., we really need some *results* of A against several other programs, and of the other programs each against several other programs, in order to produce a meaningful ELO that we can use to rank the several programs, including A.

The fact that A is better in the endgame than B, and B is better in the opening than C, doesn't mean that A will be better than C. An ELO rating is based on *results*, nothing more and nothing less. We could, however, create a URI rating, based on the URI system of game observation. That is reasonable--but it is not an ELO system, and in fact it ignores some of the important underpinnings of the ELO system as laid out by Arpad Elo.

When A plays B, we can say that the ELO ratings *suggest* that A will beat B by roughly a 2 to 1 margin, since A is approx 100 rating points higher than B. But the ELO system never guarantees a 2-1 ratio of victory for A. Natural variability exists in many aspects of such a match: in the measurement of how strong A really is and how strong B really is, and in how strongly A actually plays in a specific match and how strongly B actually plays in a specific match.
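The two mechanisms described above--predicting a %score from a rating difference, and then nudging a rating up or down toward the observed results--can be sketched in a few lines of Python. This is my own illustration, not Dr. Elo's notation: the function names and the K-factor of 32 are assumptions chosen for the example.

```python
def expected_score(rating, opp_rating):
    """Predicted fraction of available points (win=1, draw=0.5, loss=0)
    for a player against one opponent, from the rating difference alone."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - rating) / 400.0))

def updated_rating(rating, points_scored, points_expected, k=32):
    """Adjust the rating toward recent results: up when the player scored
    more points than predicted, down when fewer."""
    return rating + k * (points_scored - points_expected)

# A at 1600 versus B at 1500: roughly 100 points higher predicts that
# A scores about 64% of the points, in the neighborhood of 2 to 1.
e = expected_score(1600, 1500)   # about 0.64

# If A then scores only 3/6 in a six-game match against B,
# A's rating is adjusted downward:
new_rating = updated_rating(1600, 3.0, 6 * e)
```

Note that the formula predicts only an *expected* fraction of points over a series of games; any single game, or even a short match, can and will deviate from it, which is exactly the natural variability mentioned above.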
Since an ELO rating is established within a *pool* of different players, the relative ELO ratings of A and B *may not* apply to a direct matchup of A versus B. It is not a transitive property that: if A is better by x points than the average ELO of a large rated pool, and B is better by x-y points (x, y positive) than the average ELO of the same large rated pool, then A will beat B in a head-to-head match. Why? Because the ELO rating is a relative measure of A versus the pool in general, and of B versus the pool in general--not simply and directly a measure of how well A will do versus B.

>Uri

Thanks for listening. I read most of your posts (not *all* of them, heh heh). You have many interesting things to say, right or wrong (or not proven). In this case, I had to reply and disagree with your comments. :)

Take care, my chess friend,

--Steve Boak