Author: Bertil Eklund
Date: 02:34:55 06/14/01
On June 14, 2001 at 02:49:18, Martin Schubert wrote:

>On June 13, 2001 at 18:32:45, Bertil Eklund wrote:
>
>>On June 13, 2001 at 17:56:08, James T. Walker wrote:
>>
>>>On June 13, 2001 at 16:14:33, Christophe Theron wrote:
>>>
>>>>On June 13, 2001 at 11:20:20, James T. Walker wrote:
>>>>
>>>>>On June 13, 2001 at 00:01:19, Christophe Theron wrote:
>>>>>
>>>>>>On June 12, 2001 at 22:50:01, James T. Walker wrote:
>>>>>>
>>>>>>>On June 12, 2001 at 20:54:16, stuart taylor wrote:
>>>>>>>
>>>>>>>>On June 12, 2001 at 18:41:58, Christophe Theron wrote:
>>>>>>>>
>>>>>>>>>On June 12, 2001 at 14:48:10, Thoralf Karlsson wrote:
>>>>>>>>>
>>>>>>>>>> THE SSDF RATING LIST 2001-06-11        79042 games played by 219 computers
>>>>>>>>>>
>>>>>>>>>>                                             Rating   +    -  Games  Won  Oppo
>>>>>>>>>>                                             ------  ---  --- -----  ---  ----
>>>>>>>>>>  1 Deep Fritz 128MB K6-2 450 MHz              2653   29  -28   647  64%  2551
>>>>>>>>>>  2 Gambit Tiger 2.0 128MB K6-2 450 MHz        2650   43  -40   302  67%  2528
>>>>>>>>>>  3 Chess Tiger 14.0 CB 128MB K6-2 450 MHz     2632   43  -40   308  67%  2508
>>>>>>>>>>  4 Fritz 6.0 128MB K6-2 450 MHz               2623   23  -23   968  64%  2520
>>>>>>>>>>  5 Junior 6.0 128MB K6-2 450 MHz              2596   20  -20  1230  62%  2509
>>>>>>>>>>  6 Chess Tiger 12.0 DOS 128MB K6-2 450 MHz    2576   26  -26   733  61%  2499
>>>>>>>>>>  7 Fritz 5.32 128MB K6-2 450 MHz              2551   25  -25   804  58%  2496
>>>>>>>>>>  8 Nimzo 7.32 128MB K6-2 450 MHz              2550   24  -23   897  58%  2491
>>>>>>>>>>  9 Nimzo 8.0 128MB K6-2 450 MHz               2542   28  -28   612  54%  2511
>>>>>>>>>> 10 Junior 5.0 128MB K6-2 450 MHz              2534   25  -25   790  58%  2478
>>>>>>>>>
>>>>>>>>>Congratulations to Frans Morsch and Mathias Feist (and the ChessBase team).
>>>>>>>>>
>>>>>>>>>Deep Fritz is definitely a very tough client. You cannot lead the SSDF list by accident, and leading it for so many years in a row is probably the best achievement of a chess program of all time.
>>>>>>>>>
>>>>>>>>>If you want to sum up the history of chess programs for microcomputers, I think you just need to remember 3 names:
>>>>>>>>>* Richard Lang
>>>>>>>>>* Frans Morsch and Mathias Feist
>>>>>>>>>
>>>>>>>>>    Christophe
>>>>>>>>
>>>>>>>>The roaring absence of the name Christophe appears, of course, in the signature of the post.
>>>>>>>>But I have a little question. Does Deep Fritz have any advantage in the testing, e.g. the fact that it already stood at the top long before the recent GT even arrived on the scene, and so may have had an advantageous starting point?
>>>>>>>>S.Taylor
>>>>>>>
>>>>>>>Hello Stuart,
>>>>>>>I believe that is a valid question. I would like to know the answer. I would like to know if the SSDF "zeros out" the book learning of, say, Deep Fritz before starting a match with Gambit Tiger when Gambit Tiger is brand new. I still think the SSDF list is questionable because of the differences in opponents each program has to face. I'm sure it's better than nothing, but I sure wouldn't like to hang my hat on a 3-point difference in SSDF ratings (or even 20 points for that matter).
>>>>>>>Jim
>>>>>>
>>>>>>I don't question the reliability of the list.
>>>>>>
>>>>>>It is the most reliable tool that we have to evaluate the chess programs. The difference in the opponents each program has to face does not matter from a mathematical point of view.
>>>>>>
>>>>>>Year after year we can see that the list is reliable. Almost all objections get refuted, little by little. Of course it is not absolutely perfect, but I think it's damn good.
>>>>>>
>>>>>>    Christophe
>>>>>
>>>>>Hello Christophe,
>>>>>I think the thread got sidetracked, but I disagree with your assessment of the SSDF list. I agree it's not perfect and it's pretty good, but... I think it's too easy to make one program come out on top by selecting the number of games played vs certain opponents. If you could play only one opponent and get a true rating then there would be no problem. We all know this is not the case. Some programs do better against certain opponents and worse vs others. So if you play more games vs the opponent you do best against, it will inflate your rating. Of course the opposite is also true. So if program "A" plays its favorite opponent while program "B" plays its "nemesis" more games, then naturally program "A" will look better even though they may be equal, or even the opposite is true. This becomes very critical when the difference in rating is only a few points in reality. I'm not saying the SSDF does this on purpose, but I'm sure they are doing nothing to compensate for this possibility. In my opinion the best way to do the SSDF list would be to make all top programs play an equal number of games against the same opponents. That way the top programs would all play the same number of games against the same opponents, and the list would look like this:
>>>>>
>>>>>Name       Rating  Number of games
>>>>>Program A    2600              400
>>>>>Program B    2590              400
>>>>>Program C    2580              400
>>>>
>>>>I cannot think of any real evidence that such a phenomenon exists. Can you mention, amongst the top programs, which program gets killed by what other program?
>>>>
>>>>Does anyone have statistical evidence of this?
>>>>
>>>>But anyway, even if all programs meet each other, I know some people will say that there is another way to bias the results: by letting a given program enter or not enter the list, you have an influence on the programs it is supposed to kill.
>>>>
>>>>It's a never-ending story.
>>>>
>>>>    Christophe
>>>
>>>Hello Christophe,
>>>You don't have to get killed or be a killer to change the rating by a few points. The first program that comes to mind is ChessMaster. I believe that playing a "learning" program vs a non-learning program will add rating points to the learning program with more and more games played between them. If this is not the case, then you could just play 500 games vs any opponent you chose and your rating would be just as accurate. In any case this "bias" could be avoided with a little planning.
>>>Jim
>>
>>OK, and what is wrong now that favours program x or y?
>>
>>Bertil
>
>I doubt that the list favours a program. But I think your idea is to play 40 games in a match, so I wonder why not play exactly 40 games. Sometimes you play more, sometimes you play less. I don't think it's a big problem playing 39 or 42 games, but it should be no problem playing the same number. The reason I would prefer this is statistics: the best way to get good statistics for ratings would be to play a tournament like Cadaques, every program against each other the same number of games.
>
>Regards, Martin

Hi!

Usually we try to play 40-game matches, but on the last list (2001-06-11) some matches are not finished. In the match of Tiger against Deep Fritz, at 17-17 or so, Tony received the new Athlon parts and of course he upgraded as soon as he got them! In some cases a match could be shorter because of hardware or software problems.
Bertil
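[Editor's note: Jim's schedule argument can be made concrete with a small worked example. The Python sketch below is purely illustrative; the opponent ratings, score percentages, and game counts are invented, and this is not how the SSDF actually computes its list. It estimates an Elo performance rating for two programs of identical strength whose schedules are weighted differently between a "favourite" and a "nemesis" opponent; the favourite-heavy schedule comes out a handful of points higher, which is exactly the few-point margin the thread is arguing about.]

    # Illustrative only: how an unbalanced schedule can shift a performance rating.
    # All numbers below are invented for the example, not SSDF data or method.

    def expected_score(rating, opp_rating):
        """Standard Elo expected score for one game."""
        return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400))

    def performance_rating(results, tol=0.001):
        """results: list of (opponent_rating, score) pairs, score in {0, 0.5, 1}.
        Bisect for the rating whose expected total matches the actual total."""
        total = sum(score for _, score in results)
        lo, hi = 1000.0, 4000.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if sum(expected_score(mid, opp) for opp, _ in results) < total:
                lo = mid   # expected too low -> trial rating too low
            else:
                hi = mid
        return (lo + hi) / 2

    # Same "true" strength: 70% against a favourite opponent rated 2450,
    # 55% against a nemesis rated 2550. Only the game counts differ.
    favourite_heavy = ([(2450, 1)] * 28 + [(2450, 0)] * 12   # 40 games, 70%
                       + [(2550, 1)] * 11 + [(2550, 0)] * 9) # 20 games, 55%
    nemesis_heavy   = ([(2450, 1)] * 14 + [(2450, 0)] * 6    # 20 games, 70%
                       + [(2550, 1)] * 22 + [(2550, 0)] * 18) # 40 games, 55%

    print(round(performance_rating(favourite_heavy)))  # a few points higher...
    print(round(performance_rating(nemesis_heavy)))    # ...than the nemesis-heavy schedule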