Computer Chess Club Archives


Subject: Re: SSDF Rating list

Author: Dann Corbit

Date: 12:06:43 06/12/01


On June 12, 2001 at 14:48:10, Thoralf Karlsson wrote:

>  THE SSDF RATING LIST 2001-06-11   79042 games played by  219 computers
>                                           Rating   +     -  Games   Won  Oppo
>                                           ------  ---   --- -----   ---  ----
>   1 Deep Fritz  128MB K6-2 450 MHz          2653   29   -28   647   64%  2551

Congratulations to the Fritz team for a fabulous chess engine.  To top the SSDF
list is an incredible achievement that shows definite high quality.
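
As a rough sanity check on how the Rating column relates to the Won and Oppo columns, here is a minimal sketch using the standard logistic Elo formula. Whether SSDF computes its ratings exactly this way is an assumption on my part, and the function name is only illustrative. Because the Won column is rounded to a whole percent, the result is approximate.

```python
import math

def performance_rating(opp_avg, score):
    """Estimate a rating from the average opponent rating and a score
    fraction (0 < score < 1), using the logistic Elo model."""
    return opp_avg + 400 * math.log10(score / (1 - score))

# Deep Fritz: 64% against opposition averaging 2551 (figures from the list)
deep_fritz_estimate = performance_rating(2551, 0.64)
print(round(deep_fritz_estimate))
```

For Deep Fritz this gives about 2651, close to the listed 2653.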

>   2 Gambit Tiger 2.0  128MB K6-2 450 MHz    2650   43   -40   302   67%  2528
>   3 Chess Tiger 14.0 CB 128MB K6-2 450 MHz  2632   43   -40   308   67%  2508

Two tigers are right on Fritz's tail!  Gambit Tiger (in particular) has a mean
ELO just 3 points lower.  Considering the size of the error bar, I expect a big
dogfight (catfight?) to see who can chin the bar the most times.
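
To make the error bar point concrete, here is a small sketch that turns the Rating, + and - columns into intervals and checks whether two of them overlap. The helper names are my own, and reading the + and - columns as the edges of a confidence interval is an assumption about how the list is meant to be used.

```python
def interval(rating, plus, minus):
    """Return (low, high) for a rating with asymmetric error margins.
    The minus column is listed as a negative number, so take abs()."""
    return rating - abs(minus), rating + plus

def overlaps(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

deep_fritz = interval(2653, 29, -28)
gambit_tiger = interval(2650, 43, -40)
print(deep_fritz, gambit_tiger, overlaps(deep_fritz, gambit_tiger))
```

The two intervals overlap almost completely, so the 3 point gap says nothing yet about which engine is really stronger.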

>   4 Fritz 6.0  128MB K6-2 450 MHz           2623   23   -23   968   64%  2520
>   5 Junior 6.0  128MB K6-2 450 MHz          2596   20   -20  1230   62%  2509

I guess that Deep Junior has not been tested yet?  Probably too new.

>   6 Chess Tiger 12.0 DOS 128MB K6-2 450 MHz 2576   26   -26   733   61%  2499
>   7 Fritz 5.32  128MB K6-2 450 MHz          2551   25   -25   804   58%  2496
>   8 Nimzo 7.32  128MB K6-2 450 MHz          2550   24   -23   897   58%  2491
>   9 Nimzo 8.0  128MB K6-2 450 MHz           2542   28   -28   612   54%  2511
>  10 Junior 5.0  128MB K6-2 450 MHz          2534   25   -25   790   58%  2478
>  11 Gandalf 4.32f  128MB K6-2 450 MHz       2531   28   -28   627   51%  2524
>  12 Hiarcs 7.32  128MB K6-2 450 MHz         2525   27   -27   679   56%  2482
>  13 SOS  128MB  K6-2 450 MHz                2521   22   -22  1022   52%  2508
>  13 Hiarcs 7.01  128MB K6-2 450 MHz         2521   34   -34   419   46%  2550
>  15 Rebel Century 3.0  128MB K6-2 450 MHz   2518   30   -30   546   49%  2524
>  16 Chessmaster 8000  128MB K6-2 450 MHz    2502   50   -52   191   42%  2560

Now that we have the facts and figures in, I see that Chessmaster 8000 is right
where a mathematical prediction would land it.  Against CM6000 on hardware of
approximately half the speed, the difference is 2502 - 2473 = 29 ELO.  Considering
the uncertainty intervals, this is in remarkably good agreement with expectation.
I think (perhaps) the Chessmaster people have not put nearly as much attention
into their opening book as the Fritz folks.  This is just a hunch, but I suspect
a superior book would be very helpful.

>  17 Goliath Light  128MB K6-2 450 MHz       2497   28   -28   628   44%  2539
>  18 Nimzo 99  128MB K6-2 450 MHz            2489   24   -24   826   49%  2493
>  19 Crafty 17.07/CB 128MB K6-2 450 MHz      2487   24   -24   857   47%  2506
>  20 Fritz 5.32  64MB P200 MMX               2478   18   -18  1473   53%  2455
>  21 MChess Pro 8.0  128MB K6-2 450 MHz      2477   29   -30   557   43%  2525
>  22 Chessmaster 6000  64MB P200 MMX         2473   61   -53   184   76%  2278
>  22 Hiarcs 7.32  64MB P200 MMX              2473   23   -22   970   55%  2435
>  24 Fritz 5.0 PB29%  67MB P200 MMX          2459   23   -22  1005   66%  2342
>  24 Hiarcs 7.0  64MB P200 MMX               2459   21   -21  1112   55%  2420
>  26 Nimzo 99  64MB P200 MMX                 2446   23   -23   885   51%  2439
>  27 Junior 5.0  64MB P200 MMX               2432   19   -20  1280   47%  2454
>  28 Nimzo 98  58MB P200 MMX                 2426   21   -21  1126   56%  2380
>  29 Rebel 9.0  47MB P200 MMX                2421   24   -23   900   61%  2342
>  30 Hiarcs 6.0  49MB P200 MMX               2417   24   -24   829   56%  2373
>  31 Rebel 8.0  51MB P200 MMX                2409   22   -22   971   48%  2424
>  32 MChess Pro 6.0  41MB P200 MMX           2406   24   -24   831   52%  2393
>  33 Shredder 2.0  58MB P200 MMX             2401   20   -20  1242   46%  2433
>  34 MChess Pro 7.1  46MB P200 MMX           2394   22   -22  1042   53%  2371
>  35 Genius 5.0 DOS  46MB P200 MMX           2390   20   -20  1177   50%  2390
>  35 MChess Pro 8.0  64MB P200 MMX           2390   27   -27   681   53%  2367
>  37 Chess Tiger 11.8  Pentium 90 MHz        2382   43   -43   261   50%  2383
>  38 Gandalf 3.0  64MB P200 MMX              2364   41   -40   307   59%  2297
>  39 Kallisto II  64MB P200 MMX              2343   35   -35   403   52%  2328
>  40 Rebel 9.0 Pentium 90 MHz                2335   23   -23   890   47%  2356
>  41 Junior 4.0 Pentium 90 MHz               2287   22   -22  1035   42%  2341
>  42 Shredder 1.0 Pentium 90 MHz             2282   59   -58   145   53%  2263
>  43 R30 v. 2.5                              2274   41   -38   343   69%  2135
>  44 Meph Genius 68 030 33 MHz               2198   45   -44   248   55%  2161
>  45 Berlin Pro 68 020 24 MHz                2125   24   -24   850   58%  2071
>  45 Meph RISC 2   1 MB                      2125   62   -66   125   39%  2205
>  47 Mephisto Montreux ARM  14 MHz 512K      2099   29   -28   689   73%  1930
>  48 Atlanta    SH7000 20 MHz                2090   29   -28   647   69%  1949
>  49 Sapphire II                             2012   35   -33   444   63%  1916
>  50 Milano Pro  SH7000 20 MHz               1974   33   -32   469   61%  1895
>
>
>
> 2 Gambit Tiger 2.0  128MB K6-2 450 MHz, 2650
>DpFritz K6450     20-22    Fritz6 K6-450     16-13    Junior6 K6450   21.5-18.5
>Hiarcs7 K6450      9-7     Nimzo99 K6450   10.5-4.5   Fritz532 P200     37-11
>MCP8 K6-2 450     30-10    Hiar732 P200X     23-9     Junior5 P200X   34.5-5.5
>
> 3 Chess Tiger 14.0 CB 128MB K6-2 450 MHz, 2632
>DpFritz K6450     17-17    Junior6 K6450   19.5-19.5  SOS  K6-2 450    9.5-3.5
>CM8000 K6-450   17.5-14.5  Goliath K6450   27.5-12.5  Nimzo99 K6450    7.5-1.5
>Fritz532 P200     31-11    Hiar732 P200X     38-13    Junior5 P200X      6-2
>Rebel 8 P200X   32.5-7.5
>
> 16 Chessmaster 8000  128MB K6-2 450 MHz, 2502
>DpFritz K6450    7.5-32.5  CT14 CB K6450   14.5-17.5  Junior5 K6450   13.5-26.5
>SOS  K6-2 450     24-16    Nimzo99 K6450   15.5-15.5  Shred 2 P200X      5-3
>
>
>The most common email question about the SSDF rating
>list has been about the absence of any Chessmaster
>version on K6-2 450 MHz. And the answer has always been:
>"Chessmaster can not be played automatically, and none
>of the testers are nowadays willing to play manual games."
>
>From now on I expect that the above mentioned question will
>cease to arrive! Thanks to a Winboard adapter from Eberhard
>Börger, it's now possible to play automatically with
>Chessmaster 8000, although only one game at a time. For
>unknown reasons it doesn't work for all of the testers, but
>at least for a couple of us.
>
>After 191 tournament games Chessmaster 8000 K6-2 450 MHz has
>received a rating of 2502. As CM6000 on P200 MMX has 2473, the
>present result is clearly a disappointment. Even with no
>change of the chess engine, you would have expected about
>fifty more points.

I disagree here.  I would have guessed a 50-70 ELO increase, and given the size
of the error bars, the actual result is *well* within expectations.
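
Here is a rough sketch of that comparison. It assumes the two ratings are independent and that each error bar can be treated as symmetric by averaging the + and - columns, then combines them as a root sum of squares. Those are my simplifying assumptions, not SSDF's published method.

```python
import math

def symmetric_margin(plus, minus):
    """Average the + and - columns into one symmetric margin."""
    return (plus + abs(minus)) / 2

def difference_with_margin(rating_a, margin_a, rating_b, margin_b):
    """Rating difference and its combined margin, assuming independent errors."""
    return rating_a - rating_b, math.hypot(margin_a, margin_b)

cm8000_margin = symmetric_margin(50, -52)   # CM8000 on K6-2 450: 2502 +50 -52
cm6000_margin = symmetric_margin(61, -53)   # CM6000 on P200 MMX: 2473 +61 -53
diff, margin = difference_with_margin(2502, cm8000_margin, 2473, cm6000_margin)
low, high = diff - margin, diff + margin
print(f"observed gain {diff}, plausible range {low:.0f} to {high:.0f}")
```

The observed gain is 29 with a combined margin of roughly 76, so a true gain anywhere in the 50-70 range sits comfortably inside the interval.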

>Due to relatively few games played with
>the two Chessmaster versions, statistical reasons could be
>a partial explanation. Another possibility is that CM8000 is
>a weaker chess program than CM6000.

That's a popular theory being offered by various people.  Are you
jumping on the bandwagon?  I see no statistical evidence for this whatsoever.
When your error bars justify such a claim, it might be worth investigating.  To
be sure of how the Chessmaster engine scales with CPU power, you could also run
the CM6000 engine on 450 MHz machines.  I know for certain that not all
engines scale in the same way.  For instance, the free engine Amy does
better and better at longer and longer time controls and on faster and faster
hardware.  Other engines do not behave this way.

In short (if anything) the evidence REJECTS that hypothesis.

>The most fascinating thing about testing chess programs is when
>you can conclude that a new program is better than its
>predecessor. As could be seen above, this is not always the case.
>Therefore I'm glad to say that Christophe Théron has managed
>to clearly increase the playing strength of his program during
>the latest 18 months!
>
>The Rebel-version of Gambit Tiger 2.0 K6-2 450 MHz has a rating
>of 2650 after 302 games!! And Chess Tiger 14.0 K6-2 450 MHz
>(sold by ChessBase) got 2632 after 308 games! Compared to CT 12.0
>the rating increase is 74 and 56 points! GT2 is only three points
>behind the leading Deep Fritz. On average the two Tiger versions
>are twelve points behind. It will be interesting to see what
>happens when more games are played.
>
>The fight between Fritz and Tiger will continue, now also at
>a higher level! The reason is that SSDF has started to test on
>new hardware, an Athlon 1200 MHz with 256MB RAM! After the summer
>we expect to present the first results with 5 or 6 of the strongest
>programs.

A lot of people have been hoping for this.  I like it also, because the quality
of the chess will improve.

>New programs which we expect to test in the near future are
>Shredder 5.32, Junior 7.0 and Gandalf 4.32h. If Hiarcs 8.0 arrives
>during summer, it will also be included on the next list, which
>will appear in late August or early September.
>
>Thoralf Karlsson




Last modified: Thu, 15 Apr 21 08:11:13 -0700
