Author: Albert Silver
Date: 04:08:52 02/15/03
On February 15, 2003 at 04:52:44, David Dory wrote:

>On February 14, 2003 at 13:32:16, Tony Hedlund wrote:
>
>>On February 14, 2003 at 09:27:26, Rolf Tueschen wrote:
>>
>>>On February 14, 2003 at 08:43:12, Bob Durrett wrote:
>>>
>>>>
>>>>Excellent points. The "bottom line" is that SSDF presented their findings
>>>>properly, but the problem is in interpretation. SSDF cannot be held responsible
>>>>for errors in interpretation.
>>>>
>>>>Bob D.
>>>
>>>
>>>Wrong conclusion. I tried to explain the points but apparently it's a bit too
>>>difficult. In short : If you use a system of statistics you are not allowed to
>>>make your own presentation. The presentation by SSDF is FALSE. That is the
>>>point. False and unallowed. Instead of 1., 2., 3., they should say 1.-3., not
>>>should, but must, if the differences in the actual results are way smaller than
>>>the error in the tests itself. Is that impossible to understand?
>>>
>>>Rolf Tueschen
>>
>>Then the right presentation is:
>>
>>1-10 Shredder 7           2801-2737
>>1-10 Deep Fritz 7         2789-2732
>>1-11 Fritz 7              2770-2711
>>1-2? Shredder 7 UCI       2761-2638
>>1-15 Chess Tiger 15       2753-2700
>>1-15 Shredder 6 Pad UCI   2750-2703
>>1-16 Shredder 6           2750-2689
>>1-19 Chess Tiger 14       2744-2684
>>1-19 Deep Fritz           2741-2680
>>1-19 Gambit Tiger 2       2739-2681
>>3-2? Junior 7             2715-2659
>>4-2? Hiarcs 8             2707-2657
>>
>>and so on.
>>
>>Tony
>
>Oh Good Grief!
>Yes, I have to say I actually agree with Rolf. The SSDF should NOT try to select
>a number one UNLESS they have played enough games to be sure they have the right
>program selected, taking into account the margin of error.

I don't agree. The SSDF present their findings and that's it. The findings show how well a program did against other programs. After hundreds of games they show the *current* rating (it changes as more results are added) of the program as well as the number of games, individual results, and the margin of error. The results are presented according to the highest to lowest rating. There is no 'selection' of the top program. What would you have them do? Present it in alphabetical order?

Furthermore, the best program against humans may easily not be the best program against other programs.

                                        Albert

>
>I'm sure this is a nod in the direction of marketing hype, but for commercial
>chess programs, the marketing force HAS to be very strong, otherwise the program
>probably would not exist for long.
>
>You have a point Rolf, but it will be buried by market hype, and that's life.
>The whole SSDF rating work perhaps can best be thought of as a longer tournament
>- ie., the strongest program may not win the top spot (because enough games are
>not played to differentiate all the programs), but that's tournament life.
>
>Welcome to SSDF life. All in all, you have to really appreciate their work, if
>not every little aspect of how they present their findings.
>
>
>Dave
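
For anyone who wants to check the rank ranges in Tony's list, here is a rough Python sketch of how they can be derived from the rating intervals alone. It assumes the two numbers on each line are hard upper and lower bounds, and that a program's best possible rank is one more than the number of programs whose lower bound is above its upper bound, while its worst possible rank counts every program whose upper bound is above its lower bound. The function name `rank_range` is made up for illustration, and only the twelve entries quoted in this thread are included, so worst ranks for the lower entries come out smaller than on the full SSDF list.

```python
# Rating intervals as (upper, lower), copied from Tony Hedlund's list in this thread.
INTERVALS = {
    "Shredder 7": (2801, 2737),
    "Deep Fritz 7": (2789, 2732),
    "Fritz 7": (2770, 2711),
    "Shredder 7 UCI": (2761, 2638),
    "Chess Tiger 15": (2753, 2700),
    "Shredder 6 Pad UCI": (2750, 2703),
    "Shredder 6": (2750, 2689),
    "Chess Tiger 14": (2744, 2684),
    "Deep Fritz": (2741, 2680),
    "Gambit Tiger 2": (2739, 2681),
    "Junior 7": (2715, 2659),
    "Hiarcs 8": (2707, 2657),
}


def rank_range(name, intervals):
    """Return (best, worst) possible rank for `name` (hypothetical helper).

    Assumption: the intervals are treated as hard bounds, not as
    probabilistic confidence intervals.
    """
    upper, lower = intervals[name]
    others = [(u, l) for n, (u, l) in intervals.items() if n != name]
    # Programs that are certainly stronger: their lower bound beats our upper bound.
    best = 1 + sum(1 for (u, l) in others if l > upper)
    # Programs that could be stronger: their upper bound beats our lower bound.
    worst = 1 + sum(1 for (u, l) in others if u > lower)
    return best, worst


if __name__ == "__main__":
    for program in INTERVALS:
        best, worst = rank_range(program, INTERVALS)
        print(f"{best}-{worst} {program}")
```

On these twelve entries the sketch gives 1-10 for Shredder 7 and 1-11 for Fritz 7, and a best rank of 3 for Junior 7 and 4 for Hiarcs 8, which agrees with the figures Tony posted.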
Last modified: Thu, 15 Apr 21 08:11:13 -0700