Computer Chess Club Archives



Subject: Re: SSDF Rating List 03-02-13 (Meaningless ranking!)

Author: Rolf Tueschen

Date: 05:10:26 02/14/03



On February 14, 2003 at 07:23:56, Uri Blass wrote:

>On February 14, 2003 at 07:10:40, Rolf Tueschen wrote:
>
>>Just to explain some basics for new readers, I will show why the whole list is
>>worthless. The rankings as presented are the way they are by chance.
>>
>>Since only a few here have basic knowledge of statistics, I will explain the
>>most apparent things.
>>
>>We are told, for instance, that the first two programs are separated by 8
>>points. No matter, Stefan gets all the credit here for his first place. But is
>>it true that Shredder is stronger than Fritz?
>
>I do not know, but the better guess is that Shredder7 is better.
>
>I wrote in my response:
>"This time it seems that Sandro was right when he guessed that Shredder7 is
>better than Fritz7 and I understood that it came to be number 1 in spite of not
>using the classic interface."
>
>
>If I had been sure, I could have avoided the words "it seems".
>>
>>Here I must tell you that we simply don't know. The SSDF pretends to know,
>>but it is NOT true. How can I say such things? Easy! Look at the deviations,
>>the numbers marked + or -. We see that most programs have an expected Elo
>>rating that varies by plus/minus about 30 points! Note that the Elo minus 5 is
>>as probable as the Elo finally given for the ranking!
>>
>>If you then take a look at the Elo of the opponents on the far right, you can
>>see that even for the top programs the SSDF was unable to create equal
>>conditions. This influence of different opponents also makes the 8-point
>>difference at the top meaningless.
>
>I do not think that it is meaningless.
>
>I give Shredder a bigger probability than Fritz of being number 1, based on my
>knowledge.
>
>The probability may be only 55% (I did not try to build a model to calculate it)
>but it is enough for me to say that Shredder7 seems to be better than Deep
>Fritz7.
>
>In other words, Shredder7 is probably better than Deep Fritz7.
>
>>
>>In sum, we can say that the SSDF failed to show exactly what it pretends to
>>show: the differences between the current top programs. The SSDF presents a new
>>leader, but that goes against its own results! So the conclusion is allowed
>>that the SSDF is deliberately making its own new number 1!
>>
>>(Please note that this is not a political speech; it is what statistics
>>demands. The SSDF has received this criticism so often in the past, but they
>>still haven't changed their experimental setup.)
>>
>>Rolf Tueschen
>
>I believe that the SSDF did not choose opponents for Shredder7 in order to make
>it number 1.
>
>There is a statistical error and there are other errors in the ratings, but we
>have no way of knowing the direction of the errors.

You are confusing things. Of course Shredder could be better. Sandro must know
his business. You too. But that is not the point, Uri. The point is whether the
current list can tell us anything about existing differences between the top
progs. And I say no! The list is nonsense because they stopped testing when it
had to be continued to produce a meaningful ranking. But the SSDF is so arrogant
or ignorant that they argue that the tests will go on but the date of the list
was fixed long before. Well, in other words, they are happy to present a
ranking that goes against their own test results. Because, Uri, and this is
important, it is simply not permissible to present such a _ranking_ with
deviations of +/- 30 and differences of 8 points. This is not a question of
believing what I say. Just pick up a textbook on stats. NB: it is also beside
the point to resort to wordplay such as "but this is a list and not a ranking
list".
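
To put rough numbers on the 8-point gap versus the +/- 30 deviations, here is a
minimal Python sketch. It is not the SSDF's method. It assumes (these are
assumptions, not something stated in the list) that the +/- figures are roughly
95% margins, that the rating errors of the two programs are independent and
approximately normal, and that a margin shrinks with the square root of the
number of games. The function names and the 1000-game figure are hypothetical
and serve only as illustration.

```python
import math


def prob_a_stronger(diff, margin_a, margin_b, z=1.96):
    """Probability that A's true rating exceeds B's.

    diff      observed rating difference A - B (Elo points)
    margin_*  reported +/- margin of each program (ASSUMED to be ~95%)
    z         normal quantile matching that assumed confidence level
    """
    # Convert each +/- margin to a standard deviation.
    sigma_a = margin_a / z
    sigma_b = margin_b / z
    # Independent errors add in quadrature (ASSUMED independence).
    sigma_diff = math.hypot(sigma_a, sigma_b)
    # Normal CDF evaluated at diff / sigma_diff.
    return 0.5 * (1.0 + math.erf(diff / (sigma_diff * math.sqrt(2.0))))


def games_needed(current_games, current_margin, target_margin):
    """Games required to shrink the margin, ASSUMING margin ~ 1/sqrt(games)."""
    return math.ceil(current_games * (current_margin / target_margin) ** 2)


if __name__ == "__main__":
    # Figures from the discussion: 8 points apart, each rating +/- 30.
    print(f"P(first is really stronger) = {prob_a_stronger(8, 30, 30):.2f}")
    # Hypothetical: 1000 games played so far, margin to be cut from 30 to 8.
    print(f"games needed = {games_needed(1000, 30, 8)}")
```

Under these assumptions the first figure stays far below the 95% usually asked
for before a difference is called significant, and the second gives a feel for
how much longer the testing would have to run before an 8-point gap could carry
a ranking.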

But to understand all that, it is necessary to know something about stats. The
fakery or imposture is the same as if I showed up with calculations and
results of 4.0000 for 4, and then put 4.0000 in the top place ahead of a mere
4. It's simply false. It feigns a precision that does not exist in a
world of whole numbers (without decimals).

Rolf Tueschen
>
>Uri




Last modified: Thu, 15 Apr 21 08:11:13 -0700
