Computer Chess Club Archives

Subject: Re: More philosophy and math discussion regarding different rating methods

Author: Ratko V Tomic

Date: 15:13:32 07/29/00

>>This model is wrong and does not use the results of the games correctly to get
>>the best estimate.
>
> The model can't be wrong.  It does what it purports to do.
> It does that well.

The statistical model (a memoryless random process) that the ELO computation
assumes to underlie the variability in results is certainly not an
accurate model of that variability. The most accurate model would be
a replica of the player itself. With chess programs one can make such
an exact model; with humans one cannot. The statistical model of variability
used in ELO is the simplest nontrivial statistical model.
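
For concreteness, here is a minimal sketch of what "memoryless" means here.
It assumes the commonly used logistic form of the ELO expected-score curve
with a 400-point scale, and it ignores draws; the function names are
illustrative only and do not come from the posts in this thread.

```python
import random

def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic ELO curve (assumed form)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def simulate_wins(rating_a, rating_b, n_games, seed=0):
    """Memoryless model: each game is an independent draw with the same
    win probability, regardless of earlier games or of what happened on
    the board. Draws are ignored for simplicity (an assumption)."""
    rng = random.Random(seed)
    p = expected_score(rating_a, rating_b)
    return sum(1 for _ in range(n_games) if rng.random() < p)
```

Nothing about the players enters this model except the two rating numbers,
which is the simplification being criticized above.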

Of course, what you mean by "the model can't be wrong" is that one
can correctly apply an inaccurate model, provided one recognizes its
inaccuracy. That doesn't mean it reflects the process it models
accurately, or even well, or better than any other model.

Uri's assertion is that the model ELO uses to mimic the variability
in results is by no means accurate (in the sense of corresponding to the
actual results), and that it isn't even the best one available today. As he
suggested, one could in principle write a program that extracts much more
information from a game (e.g. via analysis and scoring of each ply)
than the ELO model does, and that predicts results more accurately than ELO.
The ELO model uses about 1.58 bits of information per game (log2 of the
three possible outcomes: win, draw, loss). The strength analyzer program
Uri mentioned would use about 5 bits of information per ply, or hundreds
of times more information per game about the process than ELO does.
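
The arithmetic behind those figures can be checked with a few lines. The
5 bits per ply is the figure quoted above; the game length of 80 plies
(40 moves per side) is my own assumption for illustration, and the
variable names are arbitrary.

```python
import math

OUTCOMES_PER_GAME = 3    # win, draw, loss
BITS_PER_PLY = 5         # figure quoted in the post
PLIES_PER_GAME = 80      # assumption: a 40-move game, i.e. 80 plies

# Information carried by the bare result of one game.
bits_from_result = math.log2(OUTCOMES_PER_GAME)

# Information available if every ply is analyzed and scored.
bits_from_analysis = BITS_PER_PLY * PLIES_PER_GAME

ratio = bits_from_analysis / bits_from_result

print(f"result only:  {bits_from_result:.2f} bits per game")
print(f"ply analysis: {bits_from_analysis} bits per game")
print(f"ratio:        about {ratio:.0f}x")
```

With these assumptions the ratio comes out in the low hundreds; a longer
or shorter assumed game length scales it proportionally.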

Of course, the ELO method is from the pre-computer era, devised to do the
best one can assuming:

1) the evaluator need not know anything about chess
2) the evaluator need not use anything beyond a slide rule or
   log tables and pencil and paper to compute the ratings
   and the predictions quickly.

If you drop either or both of these upfront restrictions (which are an arbitrary
historical accident of the technology available at the time) you can do
much better at predicting outcomes from the previous games. A
trivial example of such improved prediction for comp-comp play would be to run a
simulation of the programs on a much faster computer than the one they would play
on in the competition whose outcome you're trying to predict. Such a model is
obviously better than the ELO rating.
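
A rough sketch of the simulation idea, under stated assumptions: play_game
is a hypothetical callable standing in for "run the two programs against
each other once and return the score of the first one" (1.0, 0.5 or 0.0).
The toy_game below is only a placeholder with made-up probabilities so the
sketch runs without any engines; it is not a model of any real program.

```python
import random

def predict_score(play_game, n_games, seed=0):
    """Estimate the first program's expected score by replaying the
    matchup n_games times and averaging the results."""
    rng = random.Random(seed)
    total = sum(play_game(rng) for _ in range(n_games))
    return total / n_games

def toy_game(rng):
    """Placeholder for actually running two engines.
    The outcome probabilities here are invented for illustration."""
    r = rng.random()
    if r < 0.5:
        return 1.0   # win
    if r < 0.8:
        return 0.5   # draw
    return 0.0       # loss

estimate = predict_score(toy_game, 2000)
print(f"estimated score: {estimate:.3f}")
```

In a real setup play_game would launch the actual programs on the faster
machine, so the prediction would come from the players themselves rather
than from a rating number.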



Last modified: Thu, 15 Apr 21 08:11:13 -0700
