Computer Chess Club Archives


Subject: Not meaningless - just not absolute (Therefore a fake! see below) BS!

Author: Terry McCracken

Date: 06:34:26 02/14/03


On February 14, 2003 at 09:12:29, Rolf Tueschen wrote:

>On February 14, 2003 at 08:35:29, Albert Silver wrote:
>
>>On February 14, 2003 at 07:10:40, Rolf Tueschen wrote:
>>
>>>Just to explain some basics for new readers, I show why the whole List is
>>>worthless. The rankings are by chance the way they are presented.
>>>
>>>Since only a few here have basic knowledge in statistics I explain the most
>>>apparent things.
>>>
>>>We are told that for instance the two first programs are separated by 8 points.
>>>No matter, Stefan gets all the credit here for his first place. But is it true that
>>>Shredder is stronger than Fritz?
>>>
>>>Here I must tell you that we simply don't know it. The SSDF pretend to know it,
>>>but it is NOT true. How can I say such things? Easy! Look at the deviations.
>>>These numbers with + or -. We see that most programs have an expected Elo number
>>>varying plus/minus of about 30 points! Note, that the Elo minus 5 is as probable
>>>as the finally given Elo for the ranking!
>>>
>>>If you then take a look at the Elo of the opponents in the far right you can see
>>>that even for the top programs the SSDF was unable to create equal conditions.
>>>Also this influence by different opponents makes the 8 numbers difference at the
>>>top meaningless.
>>>
>>>In sum we can say that the SSDF failed to show - exactly what they pretend to
>>>show - the differences between the actual top programs. The SSDF presents a new
>>>leader, but that is against its own results! So that the conclusion is allowed
>>>that SSDF makes deliberately their own new number 1!
>>
>>Your comment that being number 1 in the list is not an absolute is completely
>>correct.
>
>Thank you, and I am also pleased to read a message without any insults, and that is
>good. We can concentrate on the facts. But as I could see some people don't
>like that we talk about the facts too much.

You deserve insults! You wouldn't know a fact if it bit you in the @ss!
>
>
>
>
>>The SSDF doesn't claim it is a statistical absolute either,
>
>This is false. The SSDF speaks of a Number One. Of a new number one etc. Do you
>want the evidence? Also ChessBase printed the same wording in its commercials!
>Still not believing me? It is as if you didn't want or can't understand what I
>am saying. I don't say they are cheaters. I never said these Swedes are not
>worthy of being called testers. I say that they make unnecessary mistakes. And I say that
>the staff there is simply not listening.
>
>You are right. If I say number one and give the deviations THEN in reality I am
>saying that we have no number one. Now that is what you should ask the Swedes
>why they talk such nonsense.
>
The Swedes? You arrogant @#$%^^ !
>
>
>> which is
>>why they present the data: rating performance, number of games, AND the error
>>margin.
>
>Yes, Albert, I know this, and it's why I am angry. Because it's not sound. If
>they would NOT give these details it would be more honest than giving them and
>then still claiming a number one program. When there is no such program!
>
>
>
>
>>
>>
>>     THE SSDF RATING LIST 2003-02-13   90961 games played by  251 computers
>>                                           Rating   +     -  Games   Won  Oppo
>>                                           ------  ---   --- -----   ---  ----
>>   1 Shredder 7.0  256MB Athlon 1200 MHz     2768   33   -31   547   72%  2606
>>   2 Deep Fritz 7.0  256MB Athlon 1200 MHz   2760   29   -28   654   70%  2612
>>   3 Fritz 7.0 256MB Athlon 1200 MHz         2740   30   -29   574   64%  2635
>>   4 Chess Tiger 15.0  256MB Athlon 1200 MHz 2726   27   -26   704   64%  2623
>>
>>
>>If they present the error margin, doesn't this *clearly* mean that the result
>>may be off by that much? However, so far the current performance is 2768 SSDF
>>points.
>
>
>Yes, Albert, and yesterday evening, just 4 hours before 2768 they had it the other
>way round and that is the point! I see that you can't admit the consequences of
>a factual deliberate presentation. NB a presentation MUST be independent of all
>such possibilities. From its design already. As for the argument, I heard often
>enough from SSDF, that unfortunately they had to make a break because of the
>date of publication. But this is not ok! Ok, if they had a date, THEN they
>should also tell the people that only therefore at the moment they had such and
>such. And then they should say - honestly - 1.-3. or such. But to give the
>appearance that now Shredder would be FIRST is simply FALSE.
>
>
>
>>How many games does a human play to get their rating?
>
>That is NOT the point. I will tell you what is also dishonest and false! Talking
>about the number of games, didn't you discover that with Fritz 7, who is for such a
>long time on the scene, they played the same number of games as with the two
>new entries Deep Fritz7 and Shredder7. So tell me please. Do they act after a
>pre-designed and fair plan or do they test on the fly to get the results perhaps
>not they themselves but a certain company wants?

Speculation...
>
>
>
>
>
>>I won't even
>>mention the ridiculously low requirement by FIDE to play only 9 games to get a
>>first rating. Suppose I had no rating and played 100 games against a 2000 Elo
>>player and I scored 75/100.
>
>
>I would not even try to compare this ridiculous SSDF Elo with the FIDE Elo.
>
You're NOT supposed to!
>
>
>>My performance is 2200 exactly. Is it absolute? No,
>>there is a good margin of error, yet no one will question the rating and start
>>telling me I'm not rated 2200, I'm just rated anywhere between 2140 and 2260. I
>>see no difference.
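
For readers who want to check the 75/100 example themselves, here is a rough sketch in Python. It is not anything the SSDF or FIDE publishes; the function names are made up for illustration. It assumes 75 wins and 25 losses with no draws (the post only gives the score), uses the linear rule of thumb that gives exactly 2200, and compares it with the logistic Elo curve. The margin uses a plain normal approximation, which is only one of several ways to get an error bar.

```python
import math

def performance_linear(opp_rating, wins, losses, games):
    """Linear rule of thumb: opponent rating + 400 * (wins - losses) / games."""
    return opp_rating + 400 * (wins - losses) / games

def performance_logistic(opp_rating, score_fraction):
    """Invert the logistic Elo expectation E = 1 / (1 + 10 ** (-d / 400))."""
    return opp_rating + 400 * math.log10(score_fraction / (1 - score_fraction))

def score_interval(score_fraction, games, z=1.96):
    """Normal-approximation interval for the score fraction.

    Assumes every game is a win or a loss; draws would shrink the spread.
    """
    se = math.sqrt(score_fraction * (1 - score_fraction) / games)
    return score_fraction - z * se, score_fraction + z * se

# Assumed split of the 75/100 score: 75 wins, 25 losses, no draws.
linear = performance_linear(2000, 75, 25, 100)      # 2200.0
logistic = performance_logistic(2000, 0.75)         # a little below 2200

lo_score, hi_score = score_interval(0.75, 100)
lo_rating = performance_logistic(2000, lo_score)
hi_rating = performance_logistic(2000, hi_score)

print(f"linear estimate:   {linear:.0f}")
print(f"logistic estimate: {logistic:.0f}")
print(f"rough 95% range:   {lo_rating:.0f} to {hi_rating:.0f}")
```

Whichever formula is used, the point in the quoted text stands: the single number is an estimate with a sizeable range around it, even after 100 games.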
>
>Yes, but I never read about "Albert now number one!" either. Only then we had
>that problem, we have with SSDF! Is that so difficult?
>
 Obscure....
>
>
>>They had Shredder 7 play 547 games against other programs,
>>and presented the results PLUS the error margin. It *may* still be a fraction
>>weaker than Deep Fritz 7,
>
>
>Thank you, that is my point.

I think the readers understand that 8 pts. is meaningless....
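
A quick sketch of why, using only the numbers from the list quoted above. The class and function names are invented for this example, and treating the + and - columns as a simple range around the rating is an assumption about how to read the list, not the SSDF's own procedure.

```python
from typing import NamedTuple

class Entry(NamedTuple):
    name: str
    rating: int
    plus: int    # the "+" column
    minus: int   # the "-" column, stored here as a positive number

    def interval(self):
        """Range implied by the rating and its +/- margins."""
        return (self.rating - self.minus, self.rating + self.plus)

def intervals_overlap(a, b):
    """True if the two rating ranges share at least one point."""
    lo_a, hi_a = a.interval()
    lo_b, hi_b = b.interval()
    return lo_a <= hi_b and lo_b <= hi_a

# Figures copied from the 2003-02-13 list quoted in this thread.
shredder = Entry("Shredder 7.0", 2768, 33, 31)
deep_fritz = Entry("Deep Fritz 7.0", 2760, 29, 28)
chess_tiger = Entry("Chess Tiger 15.0", 2726, 27, 26)

for other in (deep_fritz, chess_tiger):
    print(shredder.name, shredder.interval(),
          other.name, other.interval(),
          "overlap:", intervals_overlap(shredder, other))
```

Overlapping ranges are only a crude check. They show that the list by itself does not separate the programs, not that the programs are equal.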
>
>
>> but already it is clear that it performs better than
>>Chess Tiger 15 against other computers.
>
>
>Not clear from the list, but probable.

Probable? Maybe...Maybe not....Proof....A scientist needs Proof!
>
>
>
>> But even if another 200 games changed
>>the top ratings to Shredder 7 = 2762 and DF7 = 2763 would anyone be so foolish
>>as to claim one program is actually any stronger?? I certainly would never think
>>of an opponent rated 10 points more as stronger. The fact that two such
>>different playing styles achieve almost identical performances shows how rich
>>and flexible chess is.
>
>
>I have a general statement. You are completely correct. With one exception and
>that is exactly, for strange reasons, the commercial business aspect! You are
>too naive here. And I say intentionally. Because look in your message to Eduard
>you asked him if he thought that ChessBase perhaps held back Fritz8 to either
>not hurt Fritz 8 business or the Shredder business?
>
>ROFL!
>
>I would say "both"!
>
>And this is not a forbidden conclusion, it's so obvious.

Again, no PROOF! Libel!
>
>Thanks for the sound message and excuse me that I still could find the key of
>commercial interest, Albert.
>
>Rolf Tueschen
>
>>
>>                                         Albert
>>
>>>
>>>(Note please that this is not a political speech, however it is what statistics
>>>demands. The SSDF got this criticism so often in the past but they still didn't
>>>change their experimental setting.)
>>>
>>>Rolf Tueschen



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.