Computer Chess Club Archives


Subject: Re: New SSDF rating list

Author: Ricardo Gibert

Date: 04:44:08 08/05/00


On August 04, 2000 at 20:06:53, Derrick Wilson wrote:

>On August 04, 2000 at 18:51:13, Enrique Irazoqui wrote:
>
>>Just got it:
>>
>>      THE SSDF RATING LIST 2000-08-04   74012 games played by  209 computers
>>                                           Rating   +     -  Games   Won  Oppo
>>                                           ------  ---   --- -----   ---  ----
>>   1 Fritz 6.0  128MB K6-2 450 MHz           2631   28   -27   673   67%  2504
>>   2 Junior 6.0  128MB K6-2 450 MHz          2601   25   -24   864   67%  2478
>>   3 Chess Tiger 12.0 DOS 128MB K6-2 450 MHz 2573   30   -29   569   63%  2481
>>   4 Fritz 5.32  128MB K6-2 450 MHz          2553   31   -30   557   62%  2467
>>   5 Nimzo 7.32  128MB K6-2 450 MHz          2549   29   -28   613   62%  2463
>>   6 Goliath Light  128MB K6-2 450 MHz       2534   48   -48   210   51%  2528
>>   7 Hiarcs 7.32  128MB K6-2 450 MHz         2533   31   -31   519   60%  2460
>>   8 Junior 5.0  128MB K6-2 450 MHz          2526   29   -28   598   58%  2467
>>   9 SOS  128MB  K6-2 450 MHz                2516   57   -55   159   58%  2456
>>  10 Nimzo 99  128MB K6-2 450 MHz            2501   29   -29   581   54%  2475
>>  11 Crafty 17.07/CB 128MB K6-2 450 MHz      2499   27   -27   651   51%  2496
>>  12 Fritz 5.32  64MB P200 MMX               2477   20   -20  1208   57%  2429
>>  12 Hiarcs 7.32  64MB P200 MMX              2477   25   -24   815   60%  2404
>>  14 Chessmaster 6000  64MB P200 MMX         2473   61   -53   184   76%  2278
>>  15 MChess Pro 8.0  128MB K6-2 450 MHz      2470   34   -35   418   44%  2511
>>  16 Fritz 5.0 PB29%  67MB P200 MMX          2459   23   -22  1005   66%  2342
>>  17 Hiarcs 7.0  64MB P200 MMX               2458   21   -21  1106   55%  2420
>>  18 Nimzo 99  64MB P200 MMX                 2447   23   -23   885   51%  2439
>>  19 Junior 5.0  64MB P200 MMX               2433   22   -22  1010   51%  2427
>>  20 Nimzo 98  58MB P200 MMX                 2423   22   -22  1038   58%  2367
>>  21 Rebel 9.0  47MB P200 MMX                2419   24   -23   900   61%  2340
>>  22 Hiarcs 6.0  49MB P200 MMX               2417   24   -24   829   56%  2373
>>  23 Rebel 8.0  51MB P200 MMX                2409   23   -23   887   50%  2408
>>  24 MChess Pro 6.0  41MB P200 MMX           2407   26   -25   749   54%  2378
>>  25 Shredder 2.0  58MB P200 MMX             2396   21   -21  1054   48%  2408
>>  26 MChess Pro 7.1  46MB P200 MMX           2394   22   -22  1042   53%  2371
>>  27 Genius 5.0 DOS  46MB P200 MMX           2393   21   -21  1093   52%  2378
>>  28 MChess Pro 8.0  64MB P200 MMX           2390   27   -27   681   53%  2366
>>  29 Chess Tiger 11.8  Pentium 90 MHz        2387   45   -45   242   52%  2375
>>  30 Gandalf 3.0  64MB P200 MMX              2364   41   -40   307   59%  2296
>>  31 Kallisto II  64MB P200 MMX              2342   35   -35   403   52%  2327
>>  32 Rebel 9.0 Pentium 90 MHz                2334   23   -23   890   47%  2356
>>  33 Hiarcs 6.0 Pentium 90 MHz               2332   18   -18  1437   51%  2328
>>  34 Genius 5.0 DOS Pentium 90 MHz           2329   18   -18  1558   47%  2348
>>  35 MChess Pro 6.0 Pentium 90 MHz           2309   17   -17  1726   45%  2343
>>  36 Nimzo 3.5 Pentium 90 MHz                2293   22   -22   998   46%  2322
>>  37 Chessmaster 5000 Pentium 90 MHz         2287   49   -45   240   67%  2162
>>  37 Junior 4.0 Pentium 90 MHz               2287   22   -22  1035   42%  2341
>>  39 Shredder 1.0 Pentium 90 MHz             2282   59   -58   145   53%  2262
>>  40 R30 v. 2.5                              2274   41   -38   343   69%  2135
>>  41 CometA90  64MB P200 MMX                 2251   37   -39   358   36%  2351
>>  42 Fritz 4.0 Pentium 90 MHz                2234   40   -39   324   60%  2163
>>  43 WChess 1.06 Pentium 90 MHz              2230   20   -20  1222   39%  2308
>>  44 Meph Genius 68 030 33 MHz               2198   45   -44   248   55%  2161
>>  45 Berlin Pro 68 020 24 MHz                2125   24   -24   850   58%  2071
>>  45 Meph RISC 2   1 MB                      2125   62   -66   125   39%  2205
>>  47 Mephisto Montreux ARM  14 MHz 512K      2099   29   -28   689   73%  1930
>>  48 Atlanta    SH7000 20 MHz                2093   31   -29   580   67%  1967
>>  49 Sapphire II                             2013   35   -33   444   63%  1917
>>  50 Milano Pro  SH7000 20 MHz               1974   33   -32   469   61%  1895
>>
>>
>>
>> 6 Goliath Light  128MB K6-2 450 MHz, 2534
>>Junior6 K6450     12-28    Ch.Ti12 K6450      9-13    Nimz732 K6450    3.5-4.5
>>Hiar732 K6450     15-14    Nimzo99 K6450   26.5-13.5  Craf17.07 K62     23-17
>>MCP8 K6-2 450   15.5-11.5  MCP 6 P200MMX    2.5-1.5
>>
>> 9 SOS  128MB  K6-2 450 MHz, 2516
>>Hiar732 K6450      6-10    Nimzo99 K6450     10-4     Fritz532 P200   10.5-7.5
>>Hiarcs7 P200X   22.5-15.5  Junior5 P200X   18.5-13.5  190  P200MMX    23.5-13.5
>>MCP 6 P200MMX      2-2
>>
>> 15 MChess Pro 8.0  128MB K6-2 450 MHz, 2470
>>Fritz6 K6-450      9-35    Junior6 K6450     15-25    204  K6-450      3.5-4.5
>>Nimz732 K6450     19-25    Goliath K6450   11.5-15.5  Hiar732 K6450      1-4
>>Junior5 K6450   15.5-24.5  Nimzo99 K6450   20.5-21.5  Craf17.07 K62      7-15
>>Fritz532 P200   17.5-28.5  Hiarcs7 P200X     21-19    Junior5 P200X   16.5-3.5
>>193  P200MMX    27.5-12.5
>>
>>
>>
>>
>>The SSDF rating list provides information about
>>the relative strength of chess programs, when
>>tested in the way SSDF does, but does not
>>necessarily say which ELO-rating a certain program
>>would achieve after having played hundreds of
>>tournament games against human players.
>>
>>How good or bad the individual correlation
>>between SSDF- and ELO-ratings is, will most
>>likely never be established. So many games against
>>humans will never be played.
>>
>>Apart from establishing relative ratings, we have had
>>the ambition that the general level of the list
>>would be fairly realistic, compared to human ratings.
>>From our start in 1984 we have used tournament games
>>against Swedish chess players to calibrate the list.
>>At some points we have discarded older games, believing
>>that human chess players with time have become better
>>to exploit the weaknesses of chess programs.
>>
>>Until the latest rating list the level of the list has
>>been unchanged from summer 1991, and was based on 337
>>tournament games against Swedish players between 1987 and
>>1991. Regrettably it has not been possible for us to
>>play any more games for many years now.
>>
>>For some time we had the general impression that
>>the level of the list was rather OK. But during the
>>latest years it has become more and more obvious that
>>the best programs on the latest hardware don't
>>get as high ELO-ratings as our list could be interpreted
>>to predict.
>>
>>If this is due to differences between Swedish- and ELO-
>>ratings, to the "human learning effect", to some kind of
>>"spreading effect" in a computer-computer list or a
>>combination of these and perhaps other factors, we don't know.
>>
>>It is difficult to find a perfect solution, but we have
>>chosen to correlate the level of the list to the results
>>of tournament games between computers and ELO-rated
>>humans, played during the latest years. For us it has
>>been very convenient to use Chris Carsons compilation
>>of such games. Calculations based on these games indicate
>>that the level of the list is about 100 points too high.
>>So from now on we have lowered the list with 100 points!
>>
>>Our hope is that the SSDF-ratings of the top entrants as
>>a group now are better correlated with ELO-ratings. If
>>the rating-inflation to a large part is due to
>>a "spreading-effect", there is now a certain possibility
>>that the older and weaker entrants of the list would play
>>better against humans than their SSDF-ratings could
>>indicate. But having to choose, we prefer to secure that the
>>top programs have as correct ratings as possible.
>>
>>It is interesting to see how well chess programs play against
>>each other, but it's even more fascinating to see what they
>>can achieve against humans! I hope that more games against
>>strong humans will be played in the future, and that
>>Chris Carson will continue to collect these games, so that
>>the level of the SSDF list can be more securely established.
>>
>>Compared to the latest rating list in early April we now
>>have 1953 more tournament games and three new entrants.
>>Marty Hirschs MChess Pro 8.0 has been replayed on K6-2 450 MHz.
>>After 418 games it has 2470, which is 80 points more than
>>on Pentium MMX 200 MHz. The difference between these two
>>hardwares has in average been 79 points, so the result
>>is as expected.
>>
>>Completely new on the list is Michael Borgstädts
>>Goliath Light K6-2 450 MHz. It is played under the Fritz
>>surface using the opening book general.ctg from Fritz 6.
>>It has got a rating of 2534, which gives it a sixth
>>place on the list!
>>
>>SOS K6-2 450 MHz is written by Rudolf Huber and is also
>>played with the opening book of Fritz 6. After 159 games
>>it has 2516 and a ninth place!
>>
>>Crafty 17.07 K6-2 has lost 24 points compared to the
>>latest list and Nimzo 99 K6-2 has 22 points less.
>>Fritz 6 K6-2 has gone up 10 points and Junior 6 K6-2
>>has increased 12 points.
>>
>>Next official list will be made in September or
>>October.
>>
>>Thoralf Karlsson
>
>
> I don't believe this list for a second!! Consider this, on a pent 200 the list
>states that hiarcs6 is only 2417, yet the same program on that hardware defeated
>2495 dean hergott in a six game match!!

This was explained in Karlsson's post. I will repeat the relevant paragraph:

"Our hope is that the SSDF-ratings of the top entrants as
a group now are better correlated with ELO-ratings. If
the rating-inflation to a large part is due to
a "spreading-effect", there is now a certain possibility
that the older and weaker entrants of the list would play
better against humans than their SSDF-ratings could
indicate. But having to choose, we prefer to secure that the
top programs have as correct ratings as possible."
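
To put a rough number on the gap being discussed, here is a minimal sketch of what the two quoted ratings would predict for a six-game match. It assumes the standard logistic Elo expectation formula; this is not necessarily the calculation SSDF itself uses, and the function names are purely illustrative. The only inputs taken from the thread are the 2417 list rating for Hiarcs 6.0 on the P200 MMX and the 2495 rating quoted for Hergott.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected per-game score of A against B (win = 1, draw = 0.5).

    Assumption: the standard logistic Elo curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def expected_match_points(rating_a: float, rating_b: float, games: int) -> float:
    """Expected total points for A over a match of `games` games."""
    return games * expected_score(rating_a, rating_b)


# Ratings quoted in this thread.
HIARCS6_P200_SSDF = 2417  # from the 2000-08-04 SSDF list
HERGOTT = 2495            # as stated in Derrick Wilson's post

per_game = expected_score(HIARCS6_P200_SSDF, HERGOTT)
match_points = expected_match_points(HIARCS6_P200_SSDF, HERGOTT, games=6)

print(f"Expected score per game: {per_game:.3f}")
print(f"Expected points over 6 games: {match_points:.2f}")
```

Under these assumptions the list rating predicts a bit under 40% per game for Hiarcs 6, so a match win would be an overperformance relative to the list. That is the direction Karlsson says is possible for the older entrants, and six games is a very small sample either way.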


