Author: Dann Corbit
Date: 10:53:51 06/04/02
Here is how the Elo system works. It assumes that playing strength follows the normal curve. In other words, there are a few very good players, a few truly terrible players, and lots of mediocre players. A lot of data fits this pattern (heights of men, lengths of manufactured objects, grades in a class...). Now, with this data, we can fit a curve through the many data points that looks something like the bell curve (it may be skewed a bit, but that won't matter). If the data happened to follow some other odd distribution, like a gamma distribution, then we might have problems. But with chess, the data does appear to be normally distributed.

Now, what the Elo calculations do is produce a number for each player in a pool of players. The actual value of the number (e.g. 2450, 1490, 2900) is completely immaterial. The thing that does matter is the difference between one number and the other numbers in the pool. If a number is 100 units higher than another number, then we have a difference of 100, and that difference tells us the win expectancy.

Now, since the values in the Elo calculation are derived from the data itself, the end product is a data set that is self-calibrating. In other words, the reason that one program or one person is 100 Elo higher than some other person or program is that, within the pool of data points, it really did win more games against opponents of the same or better strength. Hence, we have a crude predictor for future games. Now, you cannot say that a program which is 150 Elo lower will get exactly 29.6615% of the points in a single match against a program that is 150 Elo higher. But over a broad group of many matches, it will average pretty close to that.

So, the question is: "What does the SSDF data mean, and what does it predict?" The SSDF games are played on platforms of known strength, and with fixed versions of the programs. The programs are played (when possible) under auto-232 so that human intervention is not required.
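As a rough illustration of how a rating difference turns into a win expectancy, here is a minimal sketch. It assumes the logistic approximation with a 400-point scale that is commonly used in Elo calculations (the post itself talks about the normal curve, so this is a stand-in, though it does reproduce the 29.6615% figure for a 150-point gap). The function name `expected_score` and the `scale` parameter are made up for this example.

```python
def expected_score(rating_diff, scale=400.0):
    """Expected fraction of the points (win = 1, draw = 0.5, loss = 0)
    for a player rated `rating_diff` points above the opponent.

    Assumption: logistic curve with a 400-point scale, not the exact
    formula any particular rating list uses.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))


# A program 150 points lower than its opponent: about 29.66% of the points.
print(f"{100 * expected_score(-150):.4f}%")

# Only the difference matters, not the absolute numbers.
print(expected_score(2450 - 2600), expected_score(1490 - 1640))

# The two sides' expectancies always add up to 1.
print(expected_score(150) + expected_score(-150))
```

Note that the function never sees the ratings themselves, only their difference, which is the point made above about the absolute numbers being immaterial.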
The programs play while you work and while you sleep. Tirelessly, they plug away. Eventually you get a giant pile of data.

Now: What does this data mean? It means that if you get the same machines and the same programs and play them under the same conditions, you can guess which programs will fare the best. It means nothing more and nothing less. There is nothing wrong with the SSDF data. In fact, it is the best quality chess program data available anywhere. It is odd that people spend so much energy disparaging the SSDF data and the team members' efforts when it is the best that there is. Without it, we would have a much more difficult time making judgements about how a program might perform under certain controlled conditions.

What does this data predict? If I have some program which has played (perhaps) 1000 games against known SSDF configurations, I can guess what the next 1000 games will look like and be pretty accurate.

What does the data *not* predict? It does not predict the outcome of any single match. A program might be 150 Elo higher than some other program and lose a 20-game match 0-20 (very unlikely, but still possible). It does not predict the outcome on configurations that have not been tested. It does not predict the outcome against opponents of any kind that are not included in the pool of data.
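To put rough numbers on the difference between a 20-game match and 1000 games, here is a small sketch. It makes some loud simplifying assumptions that are not in the post: every game is independent, there are no draws, and the weaker program wins each game with probability equal to its expected score under the logistic approximation with a 400-point scale. The helper names `expected_score` and `score_std` are made up for this example.

```python
import math


def expected_score(rating_diff, scale=400.0):
    # Assumption: logistic approximation with a 400-point scale.
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))


def score_std(p, games):
    """Standard deviation of the fraction of points scored over `games`
    independent win/loss games (assumption: no draws)."""
    return math.sqrt(p * (1.0 - p) / games)


p = expected_score(-150)  # the weaker program, roughly 0.2966

# Spread of the score percentage: wide for 20 games, narrow for 1000.
for games in (20, 1000):
    print(f"{games:5d} games: expect {100 * p:.1f}% "
          f"+/- {100 * score_std(p, games):.1f}% (one standard deviation)")

# Chance that the weaker program wins all 20 games of a match.
p_sweep = p ** 20
print(f"chance of a 20-0 sweep by the weaker program: {p_sweep:.1e}")
```

Under these assumptions the one-standard-deviation spread is about ten percentage points for a 20-game match and under two for 1000 games, and the 0-20 result comes out on the order of one in tens of billions: very unlikely, but not impossible. Real games include draws, which would change the exact numbers.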
Last modified: Thu, 15 Apr 21 08:11:13 -0700