Computer Chess Club Archives


Subject: Re: I'm wrong about 10-0 vs 60-40

Author: Hermano Ecuadoriano

Date: 10:55:40 02/04/01


On February 04, 2001 at 10:43:13, Ralf Elvsén wrote:

>On February 03, 2001 at 04:35:45, Andrew Dados wrote:
>>
>>The base of ELO system is 'we need to assign some numbers to players that will
>>obey Normal Distribution'. So you calculate ratings in that way.
>
>What are you saying here? That if we apply this rating system (based
>on the formula below) the resulting numbers in the rating pool
>will be normally distributed?
>Or that we assume that the "true" ratings are normally distributed
>and we therefore apply this system? Or something completely different?

Yes, your second reading is exactly what happened.
Not knowing how the ratings were distributed, the mathematician made a
(pretty good) guess that the normal distribution would be a good model:
many in the middle, few at the bottom and top.
We don't know the "true" distribution, so we proposed a model, and it has
worked O.K.

BEGIN DIGRESSION ABOUT MATHEMATICAL MODELS
But here is a general principle I'd like to point out.
Once a mathematical model is chosen, and is tested for appropriateness,
thereafter, in the course of applying the model, performing calculations,
etc., we are LOOKING AT THE MATHEMATICAL MODEL, not at the REALITY THAT
IT IS MODELING. We agree to accept the original judgement, because it works
O.K., and forget the original incomplete knowledge and guesswork.
But, after a great deal of study, a mathematical model can become ingrained
into the Intuition, so that it starts looking like REALITY ITSELF.
Case in point:
Newton invented a model for celestial mechanics.
It was confirmed to as many decimal places as we could measure.
We studied it at great length, thereby developing an intuitive certainty
about it.
While everybody was STUDYING THE MATHEMATICAL MODEL, somebody went back,
as better instruments and other ideas had become available, and actually
LOOKED AT REALITY again, and invented relativity and quantum mechanics,
etc. (It turned out they were a superset or extension of Newton.)
Many people spent their entire lives refusing to accept, or trying to
disprove the new ideas, as though Newtonianism were a religion, because they
had forgotten, or never knew, the nature of the invention. In short, because
they allowed a mathematical model to become unquestioned in their intuitions,
they in effect deified Newton, and made a scripture of Newtonianism. Let
this be a general warning about such things. (Because physicists are doing
this again with quantum mechanics, they have abandoned any reasonable
epistemology which will someday guide those who supersede THEM. Then the
next guy brave enough to let go of the dream-world, and look at reality
again, with new tools that might not exist yet, will be hailed as the next
genius? Is that all it takes: opening one's eyes? Well, they are only human.
P.S. I'm one of those who worshipped Newton.)
END DIGRESSION

Back to chess.
I am commenting not just on this thread, but about most of these threads
that never resolve anything. While very carefully argued and well-meaning,
they are missing the point in an unwise way.

I am unwilling to review statistics at this time. But here are some
suggestions. I could make more justifications for each of these.
1. The standard deviation should decrease (smoothly) as the ratings increase.
The actual strength of a Grandmaster is more consistent than that of a club
player. Everyone has a calculator or computer now, so computing a
rating-dependent deviation is not difficult.
2. The standard deviation in a computer-computer rating system should be much
less. A computer is much more consistent than a human.
3. At the very top and bottom, the standard deviation could be made
asymmetrical, to counter the tendency (of the top specifically) to blow up
(having no one to lose to).
4. I have some more difficult ideas too.
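
To make suggestions #1 and #2 concrete, here is a minimal sketch in Python.
Everything numeric in it is an assumption for illustration only: the linear
shape of sigma_for_rating, the end points (350 at rating 1000, 200 at rating
2800), and the sigma_scale knob used to shrink the deviation for
computer-computer play. None of these come from FIDE, the USCF, or Dr. Elo.
Each player's performance is treated as normal around his rating, so the
difference of two performances has deviation sqrt(sa^2 + sb^2).

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sigma_for_rating(rating: float,
                     sigma_low: float = 350.0,   # assumed sigma at r_low
                     sigma_high: float = 200.0,  # assumed sigma at r_high
                     r_low: float = 1000.0,
                     r_high: float = 2800.0) -> float:
    """Hypothetical per-player sigma that falls smoothly as rating rises.

    Linear interpolation between (r_low, sigma_low) and (r_high, sigma_high),
    clamped outside that range.
    """
    t = (rating - r_low) / (r_high - r_low)
    t = min(1.0, max(0.0, t))
    return sigma_low + t * (sigma_high - sigma_low)


def expected_score(rating_a: float, rating_b: float,
                   sigma_scale: float = 1.0) -> float:
    """Expected score of A against B under the normal model.

    sigma_scale < 1 models more consistent players (suggestion #2,
    e.g. computer-computer play); the value to use is an open question.
    """
    sa = sigma_scale * sigma_for_rating(rating_a)
    sb = sigma_scale * sigma_for_rating(rating_b)
    sigma_diff = math.hypot(sa, sb)  # deviation of the performance difference
    return normal_cdf((rating_a - rating_b) / sigma_diff)


# The same 200-point gap is worth more at the top than at club level,
# because the assumed sigma is smaller there.
print(expected_score(2600, 2400), expected_score(1400, 1200))
```

Suggestion #3 (an asymmetrical deviation at the extremes) is not covered by
this sketch.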

Where is my justification, even proof? It is no more or less than a judgement
similar to that which Dr. Elo used to fashion a usable model. #1 and #2
above, and others, are not reasonably questionable. Once someone proposes
some reasonable numbers, the adjustment is purely a political or social
matter.
Here is a reasonableness test: Is it reasonable to believe that the
traditional formula, which is very simple, can contain the vastness of this
chess-universe, which seems sometimes to be as big as life?

There are theorems now, somewhere in Information Theory, that are concerned with
how much complexity can be modeled or described by a formula having so many
variables, so many points of inflection, etc., that could be used as a
convincing proof that the ELO formula cannot be adequate.

That is what is really going on. Those of you "flipping coins" should clearly
understand that that is the ONLY thing you are doing relative to this problem.


>
>Ralf
>
>>
>>You can take it as definition of ELO system. If you need some numbers which obey
>>different distribution, then you can devise your own rating system, but ELO
>>definitely obeys normal distribution of ratings (as it defines ratings in that
>>way).
>>
>>Practically, for FIDE and USCF, the standard deviation (sigma) is about 280. That's
>>what simplified formula of 1/(1+10^(-k/400.0)) used to calculate ratings
>>implies.
>>
>>If you ever used Mathematica this is the 'real thing':
>>(sig is Sigma)
>>
>>dist[x_] := 1/(sig*Sqrt[2*Pi])*Exp[-x^2/(2*sig^2)]; (* normal density, mean 0 *)
>>p[d_] := Integrate[dist[x], {x, 0, d}] + 1/2; (* integration from 0 to d; lowercase d avoids the built-in D *)
>>
>>You definitely have your point about 'not enough data to anchor sigma' thing,
>>but for starters and for most real life match scores you can even simplify that
>>'normal distribution' model and say: all rating differences are distributed
>>equally. Within the range of +-200 ELO difference and around most programs
>>strength (being way above avg of 1740 rating) it will be valid enough to draw
>>conclusions....
>>
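
For anyone without Mathematica, the quoted comparison can be checked with a
short Python sketch. It evaluates the normal model with sigma = 280 (the
figure quoted above), the simplified formula 1/(1+10^(-k/400)), and a straight
line through the midpoint. The straight line is only one possible reading of
"all rating differences are distributed equally"; that interpretation, and
the function names, are assumptions made here.

```python
import math

SIGMA = 280.0  # value quoted in the post above


def normal_expected(d: float, sigma: float = SIGMA) -> float:
    """Expected score at rating difference d under the normal model."""
    return 0.5 * (1.0 + math.erf(d / (sigma * math.sqrt(2.0))))


def logistic_expected(d: float) -> float:
    """Simplified formula quoted above: 1/(1+10^(-d/400))."""
    return 1.0 / (1.0 + 10.0 ** (-d / 400.0))


def linear_expected(d: float, sigma: float = SIGMA) -> float:
    """Tangent line to the normal model at d = 0 (assumed reading of the
    'distributed equally' simplification)."""
    return 0.5 + d / (sigma * math.sqrt(2.0 * math.pi))


# Print the three models side by side for a few rating differences.
for d in (0, 50, 100, 200, 400):
    print(d,
          round(normal_expected(d), 4),
          round(logistic_expected(d), 4),
          round(linear_expected(d), 4))
```

Within +-200 points the normal and simplified curves stay within about one
percentage point of each other, and the straight line drifts a little further
from both toward the ends of that range.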



Last modified: Thu, 15 Apr 21 08:11:13 -0700
