Computer Chess Club Archives


Subject: Re: Comments of latest SSDF list - Nine basic questions

Author: Dann Corbit

Date: 10:53:51 06/04/02


Here is how the ELO system works:
It assumes that playing strength follows the normal curve.  In other words,
there are a few very good players, a few truly terrible players, and lots of
mediocre players.  A great deal of real-world data fits this pattern (heights of
men, lengths of manufactured objects, grades in a class...).

Now, with this data, we can fit a curve through many data points that looks
something like the bell curve (it may be skewed a bit, but that won't matter).
If the data happened to follow some other odd distribution, like a gamma
distribution, then we might have problems.  But with chess, the data does
appear to be normally distributed.

Now, what the ELO calculations do is produce a number for each player in a pool
of players.  The actual number (e.g. 2450, 1490, 2900) is immaterial in itself.
What matters is the difference between one number and the other numbers in the
pool.  If one rating is 100 points higher than another, it is that difference
of 100 which tells us the win expectancy.
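As a rough sketch of that idea: the snippet below assumes the logistic form of the expectancy curve, which is commonly used as a stand-in for the normal curve.  The function name expected_score is made up for this illustration.

```python
def expected_score(diff):
    """Expected score per game (1 = win, 0.5 = draw, 0 = loss) for a player
    rated `diff` points above the opponent.
    Assumption: the common logistic approximation with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# Only the difference matters: shifting both ratings by the same amount
# leaves the expectancy unchanged.
a = expected_score(2450 - 2350)
b = expected_score(1590 - 1490)
print(a, b)  # both are about 0.64
```

Under that assumption, a 100 point edge is worth roughly 64% of the points, no matter where in the pool the two ratings sit.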

Now, since the values in the ELO calculation are derived from the data itself,
the end product is a data set that is self-calibrating.  In other words, the
reason that one program or one person is 100 ELO higher than some other person
or program is that within the pool of data points, it really did win more games
against opponents of the same or better strength.  Hence, we have a crude
predictor for future games.  Now, you cannot say that a program rated 150 ELO
below its opponent will get exactly 29.6615% of the points in any single match.
But over a broad group of many matches, its score will average pretty close to
that.
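To see the "one match versus many matches" point in action, here is a small simulation sketch.  It assumes the logistic expectancy formula (which reproduces the 29.6615% figure) and, for simplicity, that there are no draws.  The names expected_score and simulate_match are hypothetical.

```python
import random

def expected_score(diff):
    """Logistic expectancy for a player rated `diff` above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def simulate_match(diff, games, rng):
    """Fraction of points scored over `games` games.
    Assumption: no draws, each game is an independent win or loss."""
    p = expected_score(diff)
    wins = sum(1 for _ in range(games) if rng.random() < p)
    return wins / games

rng = random.Random(2002)  # fixed seed so the run is repeatable
print(expected_score(-150))                # theoretical value, about 0.2966
print(simulate_match(-150, 20, rng))       # one short match: can be well off
print(simulate_match(-150, 100_000, rng))  # many games: close to 0.2966
```

A real pool has draws and opponents of varying strength, so treat this as a toy model only.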

So, the question is:
"What does the SSDF data mean, and what does it predict?"

The SSDF games are played on platforms of known strength, and with fixed
versions of the programs.  The programs are played (when possible) under
auto-232 so that human intervention is not required.  The programs play while
you work and while you sleep.  Tirelessly, they plug away.  Eventually you get a
giant pile of data.  Now: What does this data mean?

It means that if you get the same machines and the same programs and play them
under the same conditions you can guess which programs will fare the best.

It means nothing more and nothing less.

There is nothing wrong with the SSDF data.  In fact, it is the best-quality
chess program data available anywhere.  It is odd that people will spend so
much energy disparaging the SSDF data and the team members' efforts when it is
the best that there is.  Without it, we would have a much more difficult time
making judgements about how a program might perform under certain controlled
conditions.

What does this data predict?

If I have some program which has played (perhaps) 1000 games against known SSDF
configurations, I can guess what the next 1000 games will look like and be
pretty accurate.
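For a feel of why 1000 games gives a usable prediction, here is a back-of-the-envelope sketch of the statistical margin on a score percentage.  It assumes independent games and uses the win/loss (binomial) spread; the function name score_margin is made up for the example.

```python
import math

def score_margin(p, games, z=1.96):
    """Approximate 95% margin of error on the score fraction after `games`
    games, treating each game as an independent win/loss with win chance p.
    Draws reduce the spread, so this errs on the conservative side."""
    return z * math.sqrt(p * (1.0 - p) / games)

for n in (20, 1000):
    print(n, round(score_margin(0.5, n), 3))
```

For two evenly matched programs this works out to roughly plus or minus 22 percentage points after 20 games, but only about 3 after 1000.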

What does the data *not* predict?

It does not predict the outcome of any single match.  A program might be 150 ELO
higher than some other program and lose a 20-game match 0-20 (very unlikely, but
still possible).  It does not predict the outcome on configurations that have
not been tested.  It does not predict the outcome against opponents of any kind
that are not included in the pool of data.
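To put a rough number on "very unlikely", here is a crude estimate of the 0-20 case.  It assumes the logistic expectancy formula, no draws, and independent games, none of which is exactly true in practice.

```python
def expected_score(diff):
    """Logistic expectancy for a player rated `diff` above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# Chance that the stronger program (150 points higher) loses a given game,
# under the simplifying assumption that there are no draws.
p_loss = 1.0 - expected_score(150)

# Chance of losing 20 independent games in a row.
p_sweep = p_loss ** 20
print(p_loss, p_sweep)
```

That comes out on the order of 1e-11.  Allowing for draws would make the per-game loss chance smaller still, so the true figure should be even lower.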



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.