Author: Rolf Tueschen
Date: 06:34:41 02/15/03
On February 15, 2003 at 07:08:52, Albert Silver wrote:

>On February 15, 2003 at 04:52:44, David Dory wrote:
>
>>On February 14, 2003 at 13:32:16, Tony Hedlund wrote:
>>
>>>On February 14, 2003 at 09:27:26, Rolf Tueschen wrote:
>>>
>>>>On February 14, 2003 at 08:43:12, Bob Durrett wrote:
>>>>
>>>>>
>>>>>Excellent points. The "bottom line" is that SSDF presented their findings
>>>>>properly, but the problem is in interpretation. SSDF cannot be held responsible
>>>>>for errors in interpretation.
>>>>>
>>>>>Bob D.
>>>>
>>>>
>>>>Wrong conclusion. I tried to explain the points but apparently it's a bit too
>>>>difficult. In short : If you use a system of statistics you are not allowed to
>>>>make your own presentation. The presentation by SSDF is FALSE. That is the
>>>>point. False and unallowed. Instead of 1., 2., 3., they should say 1.-3., not
>>>>should, but must, if the differences in the actual results are way smaller than
>>>>the error in the tests itself. Is that impossible to understand?
>>>>
>>>>Rolf Tueschen
>>>
>>>Then the right presentation is:
>>>
>>>1-10 Shredder 7 2801-2737
>>>1-10 Deep Fritz 7 2789-2732
>>>1-11 Fritz 7 2770-2711
>>>1-2? Shredder 7 UCI 2761-2638
>>>1-15 Chess Tiger 15 2753-2700
>>>1-15 Shredder 6 Pad UCI 2750-2703
>>>1-16 Shredder 6 2750-2689
>>>1-19 Chess Tiger 14 2744-2684
>>>1-19 Deep Fritz 2741-2680
>>>1-19 Gambit Tiger 2 2739-2681
>>>3-2? Junior 7 2715-2659
>>>4-2? Hiarcs 8 2707-2657
>>>
>>>and so on.
>>>
>>>Tony
>>
>>Oh Good Grief!
>>Yes, I have to say I actually agree with Rolf. The SSDF should NOT try to select
>>a number one UNLESS they have played enough games to be sure they have the right
>>program selected, taking into account the margin of error.
>
>I don't agree. The SSDF present their findings and that's it. The findings show
>how well a program did against other programs.
>After hundreds of games they show
>the *current* rating (it changes as more results are added) of the program as
>well as the number of games, individual results, and the margin of error. The
>results are presented according to the highest to lowest rating. There is no
>'selection' of the top program. What would you have them do? Present it in
>alphabetical order? Furthermore, the best program against humans may easily not
>be the best program against other programs.
>
> Albert

The question "Present them in alphabetical order?" shows a complete lack of understanding of statistics and also an unwillingness to digest the messages already posted. I said what should/must be done. This is not up to them; it follows from the logic of statistics itself. Now that must hurt people who think that everything is a question of best-selling management. I would never attack you personally, you might be a fine person, but you have no idea of such necessities of science.

And NO! You can't simply react and say "But they are no scientists!", although this is correct. The point is that you are not allowed to adopt a certain routine from science and then quickly forget about the clearly defined context of such routines. I have been trying to make that point for years now. Without much success.

And "FIDE lists" is surely no way out! In FIDE you have at least a relative stability [over the years] of what you want to measure. But that is exactly why the adoption of Elo doesn't work for the always new seasonal flash in the pan. <cough>

Rolf Tueschen

>
>>
>>I'm sure this is a nod in the direction of marketing hype, but for commercial
>>chess programs, the marketing force HAS to be very strong, otherwise the program
>>probably would not exist for long.
>>
>>You have a point Rolf, but it will be buried by market hype, and that's life.
>>The whole SSDF rating work perhaps can best be thought of as a longer tournament
>>- ie., the strongest program may not win the top spot (because enough games are
>>not played to differentiate all the programs), but that's tournament life.
>>
>>Welcome to SSDF life. All in all, you have to really appreciate their work, if
>>not every little aspect of how they present their findings.
>>
>>
>>Dave
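
For readers who want to check the rank ranges in Tony's list quoted above, here is a minimal Python sketch. It rests on an assumption that is not stated anywhere in the thread: a program is taken to be certainly ahead of another only when its whole rating interval lies above the other's, and the intervals are treated as hard bounds. The names `INTERVALS` and `rank_ranges` are made up for this illustration. Only the twelve programs Tony listed are included, so the worst possible rank is capped at 12 where his list shows "2?" or a rank beyond 12.

```python
# Rating intervals (upper, lower) copied from Tony's list quoted above.
# Only the twelve listed programs are included; the real list is longer.
INTERVALS = {
    "Shredder 7": (2801, 2737),
    "Deep Fritz 7": (2789, 2732),
    "Fritz 7": (2770, 2711),
    "Shredder 7 UCI": (2761, 2638),
    "Chess Tiger 15": (2753, 2700),
    "Shredder 6 Pad UCI": (2750, 2703),
    "Shredder 6": (2750, 2689),
    "Chess Tiger 14": (2744, 2684),
    "Deep Fritz": (2741, 2680),
    "Gambit Tiger 2": (2739, 2681),
    "Junior 7": (2715, 2659),
    "Hiarcs 8": (2707, 2657),
}


def rank_ranges(intervals):
    """Return {name: (best_rank, worst_rank)} from (upper, lower) intervals.

    Assumption: a program is certainly ahead of another only if its lower
    bound is above the other's upper bound (no overlap at all).
    """
    total = len(intervals)
    result = {}
    for name, (upper, lower) in intervals.items():
        # Programs whose entire interval lies above this one's interval.
        surely_ahead = sum(
            1
            for other, (_, other_lower) in intervals.items()
            if other != name and other_lower > upper
        )
        # Programs whose entire interval lies below this one's interval.
        surely_behind = sum(
            1
            for other, (other_upper, _) in intervals.items()
            if other != name and other_upper < lower
        )
        result[name] = (1 + surely_ahead, total - surely_behind)
    return result


if __name__ == "__main__":
    for name, (best, worst) in rank_ranges(INTERVALS).items():
        upper, lower = INTERVALS[name]
        print(f"{best}-{worst} {name} {upper}-{lower}")
```

Under these assumptions Shredder 7 and Deep Fritz 7 come out as 1-10, Fritz 7 as 1-11, Junior 7 starts at rank 3 and Hiarcs 8 at rank 4, which agrees with the ranges Tony gave for those entries.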
Last modified: Thu, 15 Apr 21 08:11:13 -0700