Computer Chess Club Archives


Subject: Re: Comments of latest SSDF list - Nine basic questions

Author: Rolf Tueschen

Date: 03:29:48 06/04/02


On June 04, 2002 at 04:50:45, Peter Fendrich wrote:

>On June 03, 2002 at 22:20:08, Allen Lake wrote:
>
>>On June 03, 2002 at 20:39:35, Dann Corbit wrote:
>>
>>>There are lies, damn lies, and statistics.  The big problem with statistics is
>>>that 99% of the world's population has no idea what they might possibly mean.
>>>Therefore, when they see them, they draw all sorts of incorrect assertions from
>>>them.
>>
>>I wondered when somebody was going to say this :)  And I applaud your example of
>>the two lists below:
>>
>>>    Program            Elo    +   -   Games   Score   Av.Op.  Draws
>>>
>>>  1 LG2000V3         : 2589   97 197    31    77.4 %   2375    6.5 %
>>>  2 Yace 0.99.50     : 2586   31 104   188    86.2 %   2268   12.8 %
>>>  3 MAD-005          : 2583   37 110   145    83.1 %   2306   10.3 %
>>>  4 Crafty-18.10     : 2580  102 138    29    75.9 %   2381   27.6 %
>>>  5 Comet-B37        : 2554  103 129    31    71.0 %   2398   25.8 %
>>>  6 TCBishop-4601    : 2542  103 165    30    73.3 %   2367   13.3 %
>>>  7 Gromit3          : 2527   99 100    36    66.7 %   2407   33.3 %
>>>  8 Nejmet-260       : 2526  109 163    28    71.4 %   2367   14.3 %
>>>  9 Phalanx-xxii     : 2522  102 153    33    68.2 %   2390    9.1 %
>>> 10 AnMon-509        : 2518   33  83   195    81.0 %   2266   14.4 %
>>> 11 Amy-07           : 2507   36 106   157    82.5 %   2238    9.6 %
>>> 12 TCBishop-0045    : 2503  112 130    29    67.2 %   2378   24.1 %
>>> 13 AnMon-510        : 2497   36  98   162    81.5 %   2240   11.1 %
>>> 14 ZChess-222       : 2492   31  75   230    78.9 %   2263   12.6 %
>>> 15 GLC-213          : 2469  112 120    33    60.6 %   2395   18.2 %
>>> 16 ZChess-120       : 2452   35  70   194    74.7 %   2264   14.4 %
>>> 17 Gromit2          : 2434  125  81    32    53.1 %   2412   43.8 %
>>> 18 Pepito-121       : 2432  126 117    28    58.9 %   2369   25.0 %
>>> 19 Ant-606          : 2429  110 110    37    56.8 %   2382   16.2 %
>>> 20 FranWB-090       : 2427   36  63   202    71.5 %   2267   15.3 %
>>>
>>>    Program            Elo    +   -   Games   Score   Av.Op.  Draws
>>>
>>>  1 LG2000V3         :  289   97 197    31    77.4 %     75    6.5 %
>>>  2 Yace 0.99.50     :  286   31 104   188    86.2 %    -32   12.8 %
>>>  3 MAD-005          :  283   37 110   145    83.1 %      6   10.3 %
>>>  4 Crafty-18.10     :  280  102 138    29    75.9 %     81   27.6 %
>>>  5 Comet-B37        :  254  103 129    31    71.0 %     98   25.8 %
>>>  6 TCBishop-4601    :  242  103 165    30    73.3 %     67   13.3 %
>>>  7 Gromit3          :  227   99 100    36    66.7 %    107   33.3 %
>>>  8 Nejmet-260       :  226  109 163    28    71.4 %     67   14.3 %
>>>  9 Phalanx-xxii     :  222  102 153    33    68.2 %     90    9.1 %
>>> 10 AnMon-509        :  218   33  83   195    81.0 %    -34   14.4 %
>>> 11 Amy-07           :  207   36 106   157    82.5 %    -62    9.6 %
>>> 12 TCBishop-0045    :  203  112 130    29    67.2 %     78   24.1 %
>>> 13 AnMon-510        :  197   36  98   162    81.5 %    -60   11.1 %
>>> 14 ZChess-222       :  192   31  75   230    78.9 %    -37   12.6 %
>>> 15 GLC-213          :  169  112 120    33    60.6 %     95   18.2 %
>>> 16 ZChess-120       :  152   35  70   194    74.7 %    -36   14.4 %
>>> 17 Gromit2          :  134  125  81    32    53.1 %    112   43.8 %
>>> 18 Pepito-121       :  132  126 117    28    58.9 %     69   25.0 %
>>> 19 Ant-606          :  129  110 110    37    56.8 %     82   16.2 %
>>> 20 FranWB-090       :  127   36  63   202    71.5 %    -33   15.3 %
>>>
>>>Notice (however) that the highest Elo is 2589 in the first list and 289 in the
>>>second.  Yet this is irrelevant.  The only things that matter are the
>>>differences.
>>
>>I know you've tried long and hard to make this point over the months I've been
>>reading here, Dann.  Unfortunately, that second type of list, no matter how
>>statistically valid, is not going to "sell" many chess programs -- and isn't
>>that what most of the major rating lists are _really_ about?  This is not to say
>>that SSDF and its list don't provide a valuable consumer resource concerning the
>>commercial chess programs.
>
>I'm not sure what you mean by that, but SSDF does compute ratings solely based on
>the differences between the players. In the next step, the list is adjusted based
>on some games between programs and humans. The level of the list is intended to
>be about the same as that of the FIDE list. The two will of course not be
>interchangeable, and never can be, but it gives some feeling of where the level is
>compared to humans.

No way, Peter! As you write, it was about "some" games. This is _not_
correct calibration. At the time you did that, programs would have lost 100% of
their games against experts, masters, IMs and GMs. But afterwards you started the
comp vs comp fantasy. And today you claim GM level on the basis of "some" games
against patzers. I don't say that you intended that at the beginning. But it's a
description of the historical truth.

Rolf Tueschen

>
>Peter
>
>
>>P.S.  I realize that all of the programs you had on your two lists are freely
>>available.  It's still a wonderful example!
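
For anyone who wants to check the "only the differences matter" point against the two lists quoted above, here is a minimal sketch. It assumes the standard logistic Elo expectation with a 400-point scale; that formula is my assumption, not something stated in the thread, and it is not claimed to be the exact method SSDF uses. The names expected_score, OFFSET and rows are hypothetical and exist only for this illustration. The Elo and Av.Op. values are copied from the first list, and the second list is the same data shifted down by 2300.

```python
def expected_score(rating, opponent):
    """Logistic Elo expectation; it depends only on rating - opponent."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))

# Assumed constant: the shift between the two quoted lists (2589 vs 289).
OFFSET = 2300

# (program, Elo, Av.Op.) copied from the first quoted list.
rows = [
    ("LG2000V3", 2589, 2375),
    ("Yace 0.99.50", 2586, 2268),
    ("FranWB-090", 2427, 2267),
]

for name, elo, av_op in rows:
    # Expectation on the original scale.
    high = expected_score(elo, av_op)
    # Same expectation after shifting both ratings by the same offset.
    low = expected_score(elo - OFFSET, av_op - OFFSET)
    print(f"{name:14s} diff={elo - av_op:4d} expected={high:.3f} shifted={low:.3f}")
```

Both columns come out identical for every row, because subtracting the same constant from both ratings leaves the difference unchanged. Where the absolute level of the list sits is a separate calibration question, which is exactly what is being argued about above.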




Last modified: Thu, 15 Apr 21 08:11:13 -0700
