Author: Terry McCracken
Date: 06:34:26 02/14/03
On February 14, 2003 at 09:12:29, Rolf Tueschen wrote:

>On February 14, 2003 at 08:35:29, Albert Silver wrote:
>
>>On February 14, 2003 at 07:10:40, Rolf Tueschen wrote:
>>
>>>Just to explain some basics for new readers, I show why the whole List is
>>>worthless. The rankings are by chance the way they are presented.
>>>
>>>Since only a few here have basic knowledge in statistics I explain the most
>>>apparent things.
>>>
>>>We are told that for instance the two first programs are separated by 8 points.
>>>No matter Stefan gets all the credits here for his first place. But is it true that
>>>Shredder is stronger than Fritz?
>>>
>>>Here I must tell you that we simply don't know it. The SSDF pretend to know it,
>>>but it is NOT true. How can I say such things? Easy! Look at the deviations.
>>>These numbers with + or -. We see that most programs have an expected Elo number
>>>varying plus/minus of about 30 points! Note, that the Elo minus 5 is as probable
>>>as the finally given Elo for the ranking!
>>>
>>>If you then take a look at the Elo of the opponents in the far right you can see
>>>that even for the top programs the SSDF was unable to create equal conditions.
>>>Also this influence by different opponents makes the 8 numbers difference at the
>>>top meaningless.
>>>
>>>In sum we can say that the SSDF failed to show - exactly what they pretend to
>>>show - the differences between the actual top programs. The SSDF presents a new
>>>leader, but that is against its own results! So that the conclusion is allowed
>>>that SSDF makes deliberately their own new number 1!
>>
>>Your comment that being number 1 in the list is not an absolute is completely
>>correct.
>
>Thank you and I am also pleased to read a message without any insults and that is
>good so. We can concentrate on the facts. But as I could see some people don't
>like that we talk about the facts too much.

You deserve insults! You wouldn't know a fact if it bit you in the @ss!
>
>>The SSDF doesn't claim it is a statistical absolute either,
>
>This is false. The SSDF speaks of a Number One. Of a new number one etc. Do you
>want the evidence? Also ChessBase printed the same wording in its commercials!
>Still not believing me? It is as if you didn't want or can't understand what I
>am saying. I don't say they are cheaters. I did never say these Swedes are not
>worth called testers. I say that they make unnecessary mistakes. And I say that
>the staff there is simply not listening.
>
>You are right. If I say number one and give the deviations THEN in real I am
>saying that we have no number one. Now that is what you should ask the Swedes
>why they talk such nonsense.
>
>The Swedes? You arrogant @#$%^^ !
>
>
>> which is
>>why they present the data: rating performance, number of games, AND the error
>>margin.
>
>Yes, Albert, I know this, and it's why I am angry. Because it's not sound. If
>they would NOT give these details it would be more honest than giving them and
>then still claiming a number one program. When there is no such program!
>
>>
>>
>> THE SSDF RATING LIST 2003-02-13   90961 games played by 251 computers
>>                                           Rating    +    -  Games  Won  Oppo
>>                                           ------  ---  ---  -----  ---  ----
>> 1 Shredder 7.0 256MB Athlon 1200 MHz        2768   33  -31    547  72%  2606
>> 2 Deep Fritz 7.0 256MB Athlon 1200 MHz      2760   29  -28    654  70%  2612
>> 3 Fritz 7.0 256MB Athlon 1200 MHz           2740   30  -29    574  64%  2635
>> 4 Chess Tiger 15.0 256MB Athlon 1200 MHz    2726   27  -26    704  64%  2623
>>
>>
>>If they present the error margin, doesn't this *clearly* mean that the result
>>may be off by that much? However, so far the current performance is 2768 SSDF
>>points.
>
>
>Yes, Albert, and yesterday evening, just 4 hours before 2768 they had it the other
>way round and that is the point! I see that you can't admit the consequences of
>a factual deliberate presentation. NB a presentation MUST be independent of all
>such possibilities. From its design already.
>Ad the argument, I heard often
>enough from SSDF, that unfortunately they had to make a break because of the
>date of publication. But this is not ok! Ok, if they had a date, THEN they
>should also tell the people that only therefore at the moment they had such and
>such. And then they should say - honestly - 1.-3. or such. But to give the
>appearance that now Shredder would be FIRST is simply FALSE.
>
>
>
>>How many games does a human play to get their rating?
>
>That is NOT the point. I will tell you what is also dishonest and false! Talking
>about the number of games, didn't you discover that Fritz 7 who is for such a
>long time on the scene they played the same number of games than with the two
>new entries Deep Fritz7 and Shredder7. So tell me please. Do they act after a
>pre-designed and fair plan or do they test on a fly to get the results perhaps
>not they themselves but a certain company wants?

Speculation...

>
>
>>I won't even
>>mention the ridiculously low requirement by FIDE to play only 9 games to get a
>>first rating. Suppose I had no rating and played 100 games against a 2000 Elo
>>player and I scored 75/100.
>
>
>I would not even try to compare this ridiculous SSDF Elo with the FIDE Elo.
>

You're NOT supposed to!

>
>
>>My performance is 2200 exactly. Is it absolute? No,
>>there is a good margin of error, yet no one will question the rating and start
>>telling me I'm not rated 2200, I'm just rated anywhere between 2140 and 2260. I
>>see no difference.
>
>Yes, but I never read about "Albert now number one!" either. Only then we had
>that problem, we have with SSDF! Is that so difficult?
>

Obscure....

>
>
>>They had Shredder 7 play 547 games against other programs,
>>and presented the results PLUS the error margin. It *may* still be a fraction
>>weaker than Deep Fritz 7,
>
>
>Thank you, that is my point.

I think the readers understand that 8 pts. is meaningless....
> > >> but already it is clear that it performas better than >>Chess Tiger 15 against other computers. > > >Not clear from the list, but probable. Probable? Maybe...Maybe not....Proof....A scientist needs Proof! > > > >> But even if another 200 games changed >>the top ratings to Shredder 7 = 2762 and DF7 = 2763 would anyone be so foolish >>as to claim one program is actually any stronger?? I certainly would never think >>of an opponent rated 10 points more as stronger. The fact that two such >>different playing styles achieve almost identical performances shows how rich >>and flexible chess is. > > >I have a general statement. You are completely correct. With one exception and >that is exactly, for strange reasons, the commercial business aspect! You are >too naive here. And I say intentionally. Because look in your message to Eduard >you asked him if he thought that ChessBase perhaps held back Fritz8 to either >not hurt Fritz 8 business or the Shredder business? > >ROFL! > >I would say "both"! > >And this is not a forbidden conclusion, it's so obvious. Again, no PROOF! Libel! > >Thanks for the soud message and excuse me that I still could find the key of >commercial interest, Albert. > >Rolf Tueschen > >> >> Albert >> >>> >>>(Note please that this is not a political speech, however it is what statistics >>>demands. The SSDF got this critic so often in the past but they still did't >>>change their experimental setting.) >>> >>>Rolf Tueschen
Last modified: Thu, 15 Apr 21 08:11:13 -0700