Computer Chess Club Archives


Subject: Dangers in CC: Statistics - Some Science also on Elo's Lists

Author: Rolf Tueschen

Date: 12:49:40 02/26/03


For seven years now, in CCC and before that in RGCC (see the archives here and
on Google), I have wondered why it is so difficult to understand the criticism
of the usual Elo ranking lists, both FIDE's and of course the SSDF's. A recent
message made it necessary to explain some basics. Someone wrote to me that chess
programmers know statistics, because creating, perfecting and tuning a chess
program is above all a question of stats. Since the same sender opposes my
criticism of the current lists, I decided that a direct lesson would benefit the
readers most. I will explain without technical formulas, because this is about
the very basic stuff, and that is difficult enough to understand. Alien
terminology always sounds smart, but it condemns the reader to become either a
true believer or an upset disbeliever.

So for the first time I will try to explain here what people normally learn
only at universities, and even there only if they study philosophy or the
methodology of science, or if they are very smart and understand the basics of
the natural sciences and mathematics. NB: you can well study the latter fields
and still not understand such questions, which I'll try to explain as simply as
possible. The foreword is longer than the whole explanation, methinks.

When I first read about the Elo formula in the 70s, I found it interesting that
the authors [I think it was in the Swiss Schachwoche] gave a lot of tables, so
that you could understand the sense of Elo's ranking list. The most interesting
table was the one about the winning chances.
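
A table of that kind can be approximated with a few lines of code. This is only
a minimal sketch: it assumes the widely used logistic approximation of the Elo
expectancy curve, not the tables printed in the article mentioned above, and the
function name expected_score and the chosen rating differences are purely
illustrative.

```python
def expected_score(rating_diff):
    """Expected score (win = 1, draw = 0.5) for the player who is rated
    rating_diff points higher, under the logistic approximation (an assumption)."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400.0))


# Print a small "winning chances" table for a few illustrative rating differences.
for diff in (0, 50, 100, 200, 300, 400):
    print(f"{diff:>4} points higher: expected score {expected_score(diff):.2f}")
```

Note that the expected score lumps wins and draws together, so it is not a pure
winning probability: under this approximation a player rated 200 points higher
is expected to score roughly three points out of four.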

I never read Elo in the original, so I don't know what introduction he gave for
his formulas. It doesn't matter, because I am not writing against him but only
against the false application of his formula. If he did NOT give a similar
introduction, saying basically what I am trying to explain now, then of course
the mistake is already Arpad Elo's. Anyway, there is no reason to perpetuate
mistakes.

I think Elo had a good idea (some others had the same idea earlier in different
countries): to find a formula for the performance of chess players on the basis
of their tournament results. Elo was also interested in calculating numbers for
the historic chess players; he tried to make their records comparable.

Here I make a trivial statement, but it is the basic reason for our
misunderstandings in the debates about Elo numbers. SSDF Elo or FIDE, it is all
the same.

The history of chess happened in the past. The players had no chance to
influence their Elo numbers. They simply played chess better or worse, which is
fully in line with the distribution of chess strength, which looks like a
pyramid. At the top there are very few players, and the further down you go
(weaker chess), the more chess players there are. At the base you have the
largest group, the beginners.

Today we have a totally different situation. In short, we no longer have a
system of tournament chess based on the distribution of chess strength just
described; instead we have a system where the very participation of the players
is tied to their Elo rating. Perhaps you might say that this is uninteresting.
But then you have little experience in the social sciences, where it has long
been known that the participation of humans leads to interactions between human
goals and the applied technology. Example: Kasparov would never play the
International Dutch Championships. Just try to find out for yourself what the
reasons could be.

On the contrary, Kasparov, who currently has the highest Elo number, will
prefer to play a tournament like Linares, where five big-numbered names help to
make up for the low number of the local hero. Kasparov would never play a
tournament where he loses points even though he wins the tournament. Look, if
Kasparov played the city championship of Hollywood, he would LOSE points! Elo
points. Because the opponents are rated 2300 at best. All IMO, just an example.
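
The arithmetic behind this example can be sketched as follows. Everything
specific here is an assumption made for illustration: the logistic expectancy
formula, a K-factor of 10, a 2850 rating for the favourite, and a field of nine
opponents all rated 2300. Official rating rules also cap large rating
differences, which this sketch ignores.

```python
def expected_score(own, opponent):
    """Expected score of `own` against `opponent` (logistic approximation)."""
    return 1.0 / (1.0 + 10 ** ((opponent - own) / 400.0))


def rating_change(own, opponents, points_scored, k=10):
    """Rating change after one event: K times (actual minus expected score).
    The default K-factor of 10 is an assumed value."""
    expected = sum(expected_score(own, opp) for opp in opponents)
    return k * (points_scored - expected)


# Hypothetical event: a 2850 player meets nine opponents rated 2300.
field = [2300] * 9
print(round(rating_change(2850, field, 8.0), 1))  # 7 wins, 2 draws
print(round(rating_change(2850, field, 9.0), 1))  # 9 wins out of 9
```

Under these assumptions, seven wins and two draws would almost certainly win
the event and still cost rating points, while even a perfect score gains only a
few.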

The Linares effect is called "inbreeding". With such an effect the top players
could, if they wanted, lift any 2500 GM into the 2700 region. That is what
happened with Judit Polgar. When it's really about winning she loses against
lower-rated players, but in the invitational tournaments she also has wins
against top-ranked players. Don't ask me to explain the psychology...

I think you have understood that the moment the Elo ranking as such becomes
part of the goal, it can be misused. That is the trivial consequence of the Law
of human participation and awareness of the rules. Science has developed many
methods to exclude exactly that influence (through interaction), and it's
always a tricky game to keep the human research subjects innocent. The moment
the innocence is gone, the results are directly affected.

Notice that the "inbreeding" factor is active on all levels. In chess it is
always better to play in a pool of equally and higher ranked players. And if
you as a player keep an eye on your rating, Elo or whatever, you know that you
cannot "win" by winning against weaker players. Or rather, you have no choice
but to score 100%. But as things are, this is difficult in chess. Fischer's
boost was mainly a consequence of his 6:0 results against Larsen and Taimanov.
Against Petrosjan and Spassky he also won, but not by as much. Still, his 2725
was a sensation at the time. Believe me on one thing: without the inbreeding
factor Kasparov would not be above 2800. And he is surely not 100 points
"better" than Fischer. To me Fischer is the best of all time. But I digress.

To give a first summary:

If the big numbers have the right to choose their preferred playing buddies (see
Linares), there is no longer real competition; there is a distribution of Elo
points and money, called a chess tournament. So Elo's formula no longer
measures performance in real competition; instead the formula itself is
misused to the benefit of the big players.


Now (since we are in computer chess), what is happening in the current SSDF
list? You know the answer already.

You take the equivalent of the 8 or 10 top players from Linares, here of course
the top programs (which are by definition always the newest progs of the last
two years), and then you let them play against each other. Inbreeding leads to
ever higher Elo numbers - by definition. [Just recently we were informed that
the company ChessBase apparently did not send in their new Fritz8 because they
feared a loss of Elo points with its first version. Since the SSDF tests the
progs that are sent to them, the inbreeding factor works in a very clean way.]
In a couple of years the SSDF Elo must reach 2900 and higher. By the (clean)
inbreeding alone.
That SSDF Elo has nothing to do with chess strength is proven when 2150 players
beat the 2700 progs fair and square, which would be impossible in human chess.
[In human chess, even if you won a game by chance, you would lose all the rest,
because the GM would understand better and better how to beat you at 2150.] On
the other hand, the high Elo of the machines is supported by marketing results
in show events with top players, who get at least 800 000 US $$ beforehand just
for showing up, regardless of the result! The result then is often a draw. That
is statistically the optimal result, the one that hurts both sides the least.
Marketing optima! And since Elo defined draws in such a way that the "higher
rated" player "gives" you "his" Elo number, the marketing result leads to
extremely high Elo numbers for the market product. This is then supported by
experts in CC fora, who tell the audience that XY played as [NOT like!!] a
super GM...
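
The remark about draws can be made concrete with a performance rating, here
sketched by inverting the logistic expectancy curve. That inversion is an
assumption (rating bodies typically work from lookup tables instead), and the
function name performance_rating and the 2800 figure are hypothetical.

```python
import math


def performance_rating(avg_opponent_rating, score_fraction):
    """Rating at which score_fraction would be the expected score against
    the given opposition (inverse of the logistic curve, an assumption)."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score_fraction must be strictly between 0 and 1")
    return avg_opponent_rating + 400.0 * math.log10(
        score_fraction / (1.0 - score_fraction))


# Hypothetical show match: a program ties a match with a 2800-rated player.
print(round(performance_rating(2800, 0.5)))
```

A 50% score returns exactly the opponent's rating, whatever the program's
rating on any other list: the drawn match "hands over" the opponent's number.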

If you have understood, then it's ok, but if you have questions, please ask,
because I cannot give such lessons on a weekly basis. Either now or perhaps
again in some years. So it's now or never...


Rolf Tueschen






