Author: Uri Blass
Date: 19:20:02 05/30/02
On May 30, 2002 at 21:55:21, Rolf Tueschen wrote:

>On May 27, 2002 at 02:08:01, Bertil Eklund wrote:
>
>>On May 26, 2002 at 21:05:21, Rolf Tueschen wrote:
>>
>>>On May 26, 2002 at 20:24:54, Robin Smith wrote:
>>>
>>>>On May 26, 2002 at 08:15:53, Rolf Tueschen wrote:
>>>>
>>>>>Excuse me, but you are mixing up ranking in tournament practice in a sport and
>>>>>testing procedures.
>>>>>
>>>>>Rolf Tueschen
>>>>
>>>>Yes. I know. But what difference does it make? Play some games. Calculate
>>>>ratings. Publish ratings. That one is done as "sport" and the other "testing"
>>>>doesn't change the fundamental method for calculating and publishing ratings.
>>>>The only real difference I see is that SSDF includes error bars and FIDE does
>>>>not. Perhaps you would like it better if SSDF didn't include the error bars?
>>>>
>>>>:-)
>>>>
>>>>Robin
>>>
>>>Yes, it would be much better, but there are still better ways to make me really
>>>happy with SSDF. :)
>>>
>>>Rolf Tueschen
>>
>>Yes, we all know them but you don't.
>>
>>Bertil
>
>Since I had promised a few people to write a critical summary about SSDF
>ranking I started with a German version. From this article in Rolfs Mosaik (it's
>the number 8 there) I'll quote here the following questions. The problem is,
>that the critique is rather short in effect, but for most of the aspects I have no
>exact information, that is why I wrote the nine questions for the beginning of a
>communication. My verdict however is already that the list has no validity. The
>whole presentation has a long tradition but no rational meaning. However SSDF
>could well make several changes and give the list a better foundation.
>
>[This is the final part of the article number 8]
>
>My translation:
>
># Stats could only help to establish roughly correct numbers on a valid basis,
>but without validity the Elo numbers resemble the
>fata morgana that appears to those who are thirsty in the desert. [Footnote: In
>my first part I explained that the typical Elo numbers with 2500, 2600 or 2700
>are adjusted to human players, a big pool of human players, not just 15 or 20
>players! So SSDF simply has no validity at all.]
>
># What is wrong in the SSDF stats besides the lacking validity.
>
># To answer this we clarify what is characteristic for a chess program.
>
># Hardware
>Engine
>Books
>Learning tool
>
># What is necessary for a test experiment?
>Briefly - the control of these four factors/ parameters.
>
># But at first we define, what we want to measure respectively what should be
>the result.
>
># We want to know, how successful the combination of Hardware, Engine, Books
>and Learning tool is playing. Successful play is called strength.
>
># Here follows a list of simple questions.
>
># 1) SSDF equips each time the new programs with the fastest hardware. Do we
>find out this way if the new engine is stronger than the older? No! Quite simply
>because the old engines could be as strong or stronger on new hardware.

Some programs appear in the SSDF list on more than one hardware, so we can
estimate the expected increase in rating from the new hardware.

The gain may differ from program to program, but if the rating difference is
big, then we will know whether the new programs are better than the old ones.

<snipped>

># 6) SSDF often matches newest progs vs ancient progs. Why? Because the
>variability of the choice of the opponent is important for the calculation of
>Elo numbers? Hence Kasparov is playing against a master player of about Elo
>2350? Of course not! Such nonsense is not part of human chess [as necessity of
>Elo numbers!]! Or is it that the lacking validity of the computer should be
>replaced by the play against weakest and helpless opponents? We don't know.

The SSDF does not play matches between programs whose ratings differ by more
than 400 Elo, and Kasparov does play against players who are clearly weaker
than he is.

Uri
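
To put rough numbers on both points, here is a minimal Python sketch. It assumes the standard logistic Elo expectation formula; whether the SSDF uses exactly this curve is an assumption on my side. The function names and the two ratings in the usage comment are hypothetical placeholders, not SSDF data.

```python
from statistics import mean


def expected_score(rating_diff: float) -> float:
    """Expected score of the higher-rated side under the logistic Elo model.

    rating_diff is (own rating - opponent rating) in Elo points.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def hardware_gain(pairs):
    """Average rating gain from new hardware.

    pairs is a list of (rating_on_old_hw, rating_on_new_hw) tuples for
    programs that are listed on both machines. This is a naive mean that
    ignores the error bars of the individual ratings.
    """
    return mean(new - old for old, new in pairs)


if __name__ == "__main__":
    # A 400-point gap still leaves the weaker side about 9% of the points.
    for diff in (0, 100, 200, 400):
        print(f"{diff:>3} Elo ahead -> expected score {expected_score(diff):.3f}")

    # Hypothetical placeholder ratings, only to show the call:
    # hardware_gain([(2400, 2470), (2500, 2560)]) gives 65
```

If a new program on the fast hardware is rated clearly more than the estimated hardware gain above an old program on the slow hardware, the extra part can be credited to the engine, within the error bars.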