Author: Chessfun
Date: 11:21:04 06/15/01
On June 14, 2001 at 05:34:55, Bertil Eklund wrote:

>On June 14, 2001 at 02:49:18, Martin Schubert wrote:
>
>>On June 13, 2001 at 18:32:45, Bertil Eklund wrote:
>>
>>>On June 13, 2001 at 17:56:08, James T. Walker wrote:
>>>
>>>>On June 13, 2001 at 16:14:33, Christophe Theron wrote:
>>>>
>>>>>On June 13, 2001 at 11:20:20, James T. Walker wrote:
>>>>>
>>>>>>On June 13, 2001 at 00:01:19, Christophe Theron wrote:
>>>>>>
>>>>>>>On June 12, 2001 at 22:50:01, James T. Walker wrote:
>>>>>>>
>>>>>>>>On June 12, 2001 at 20:54:16, stuart taylor wrote:
>>>>>>>>
>>>>>>>>>On June 12, 2001 at 18:41:58, Christophe Theron wrote:
>>>>>>>>>
>>>>>>>>>>On June 12, 2001 at 14:48:10, Thoralf Karlsson wrote:
>>>>>>>>>>
>>>>>>>>>>>THE SSDF RATING LIST 2001-06-11        79042 games played by 219 computers
>>>>>>>>>>>
>>>>>>>>>>>                                           Rating   +    -   Games  Won  Oppo
>>>>>>>>>>>                                           ------  ---  ---  -----  ---  ----
>>>>>>>>>>> 1 Deep Fritz 128MB K6-2 450 MHz            2653    29  -28    647  64%  2551
>>>>>>>>>>> 2 Gambit Tiger 2.0 128MB K6-2 450 MHz      2650    43  -40    302  67%  2528
>>>>>>>>>>> 3 Chess Tiger 14.0 CB 128MB K6-2 450 MHz   2632    43  -40    308  67%  2508
>>>>>>>>>>> 4 Fritz 6.0 128MB K6-2 450 MHz             2623    23  -23    968  64%  2520
>>>>>>>>>>> 5 Junior 6.0 128MB K6-2 450 MHz            2596    20  -20   1230  62%  2509
>>>>>>>>>>> 6 Chess Tiger 12.0 DOS 128MB K6-2 450 MHz  2576    26  -26    733  61%  2499
>>>>>>>>>>> 7 Fritz 5.32 128MB K6-2 450 MHz            2551    25  -25    804  58%  2496
>>>>>>>>>>> 8 Nimzo 7.32 128MB K6-2 450 MHz            2550    24  -23    897  58%  2491
>>>>>>>>>>> 9 Nimzo 8.0 128MB K6-2 450 MHz             2542    28  -28    612  54%  2511
>>>>>>>>>>>10 Junior 5.0 128MB K6-2 450 MHz            2534    25  -25    790  58%  2478
>>>>>>>>>>
>>>>>>>>>>Congratulations to Frans Morsch and Mathias Feist (and the ChessBase team).
>>>>>>>>>>
>>>>>>>>>>Deep Fritz is definitely a very tough customer. You cannot lead the SSDF list
>>>>>>>>>>by accident, and leading it for so many years in a row is probably the best
>>>>>>>>>>achievement of a chess program of all time.
>>>>>>>>>>
>>>>>>>>>>If you want to sum up the history of chess programs for microcomputers, I
>>>>>>>>>>think you just need to remember 3 names:
>>>>>>>>>>* Richard Lang
>>>>>>>>>>* Frans Morsch and Mathias Feist
>>>>>>>>>>
>>>>>>>>>>    Christophe
>>>>>>>>>
>>>>>>>>>The roaring absence of the name Christophe appears, of course, in the
>>>>>>>>>signature of the post.
>>>>>>>>>But I have a little question. Does Deep Fritz have any advantage in the
>>>>>>>>>testing, e.g. the fact that it already stood at the top long before the
>>>>>>>>>recent GT even arrived on the scene, and so may have had an advantageous
>>>>>>>>>starting point?
>>>>>>>>>S.Taylor
>>>>>>>>
>>>>>>>>Hello Stuart,
>>>>>>>>I believe that is a valid question, and I would like to know the answer. Does
>>>>>>>>the SSDF "zero out" the book learning of, say, Deep Fritz before starting a
>>>>>>>>match with Gambit Tiger when Gambit Tiger is brand new? I still think the
>>>>>>>>SSDF list is questionable because of the differences in opponents each
>>>>>>>>program has to face. I'm sure it's better than nothing, but I sure wouldn't
>>>>>>>>like to hang my hat on a 3-point difference in SSDF ratings (or even 20
>>>>>>>>points, for that matter).
>>>>>>>>Jim
>>>>>>>
>>>>>>>I don't question the reliability of the list.
>>>>>>>
>>>>>>>It is the most reliable tool that we have to evaluate the chess programs. The
>>>>>>>difference in the opponents each program has to face does not matter from a
>>>>>>>mathematical point of view.
>>>>>>>
>>>>>>>Year after year we can see that the list is reliable. Almost all objections
>>>>>>>get refuted, little by little. Of course it is not absolutely perfect, but I
>>>>>>>think it's damn good.
>>>>>>>
>>>>>>>    Christophe
>>>>>>
>>>>>>Hello Christophe,
>>>>>>I think the thread got sidetracked, but I disagree with your assessment of
>>>>>>the SSDF list. I agree it's pretty good without being perfect, but I think
>>>>>>it's too easy to make one program come out on top by selecting the number of
>>>>>>games played against certain opponents. If you could play only one opponent
>>>>>>and get a true rating there would be no problem, but we all know this is not
>>>>>>the case: some programs do better against certain opponents and worse against
>>>>>>others. So playing more games against the opponent you do best against will
>>>>>>inflate your rating, and of course the opposite is also true. If program "A"
>>>>>>plays more games against its favorite opponent while program "B" plays more
>>>>>>against its "nemesis", then naturally program "A" will look better even
>>>>>>though they may be equal, or the reverse may even be true. This becomes very
>>>>>>critical when the real difference in rating is only a few points. I'm not
>>>>>>saying the SSDF does this on purpose, but I'm sure they are doing nothing to
>>>>>>compensate for this possibility. In my opinion the best way to do the SSDF
>>>>>>list would be to make all top programs play an equal number of games against
>>>>>>the same opponents. Then the list would look like this:
>>>>>>
>>>>>>Name       Rating  Number of games
>>>>>>Program A  2600    400
>>>>>>Program B  2590    400
>>>>>>Program C  2580    400
>>>>>
>>>>>I cannot think of any real evidence that such a phenomenon exists. Can you
>>>>>mention, among the top programs, which program gets killed by what other
>>>>>program?
>>>>>
>>>>>Does anyone have statistical evidence of this?
>>>>>
>>>>>But anyway, even if all programs meet each other, I know some people will say
>>>>>that there is another way to bias the results: by letting a given program
>>>>>enter or not enter the list, you influence the programs it is supposed to
>>>>>kill.
>>>>>
>>>>>It's a never-ending story.
>>>>>
>>>>>    Christophe
>>>>
>>>>Hello Christophe,
>>>>You don't have to get killed or be a killer to change the rating by a few
>>>>points. The first program that comes to mind is ChessMaster. I believe that
>>>>playing a "learning" program against a non-learning program will add rating
>>>>points to the learning program as more and more games are played between
>>>>them. If this were not the case, then you could just play 500 games against
>>>>any opponent you chose and your rating would be just as accurate. In any
>>>>case, this "bias" could be avoided with a little planning.
>>>>Jim
>>>
>>>OK, and what is wrong now that favours program x or y?
>>>
>>>Bertil
>>
>>I doubt that the list favours a program. But I think your policy is to play
>>40-game matches, so I wonder why not play exactly 40 games. Sometimes you play
>>more, sometimes you play less. I don't think playing 39 or 42 games is a big
>>problem, but it should be no problem to play the same number, and my reason
>>for preferring that is the statistics. The best way to get good statistics for
>>ratings would be to play a tournament like Cadaques: every program against
>>each other, the same number of games.
>>
>>Regards, Martin
>
>Hi!
>
>Usually we try to play 40-game matches, but as of the last list (11/06 01) some
>matches are not finished. In the Tiger vs. Deep Fritz match, at 17-17 or so,
>Tony received the new Athlon parts, and of course he upgraded as soon as he
>received them! In some cases a match can be shorter because of hardware or
>software problems.
>
>Bertil

Personally, I agree with Christophe: the SSDF is the most reliable tool
available for rating a program. One can critique the method or a specific
result, but nothing else compares. It is as near perfect, IMO, as testing a
commercial program can be. Anyone who bothers to spend time looking through the
games knows it.

Sarah.
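Jim's opponent-selection concern can be put into numbers with the standard Elo
expected-score formula, E = 1 / (1 + 10^((Ro - Rp)/400)). Below is a minimal
Python sketch; the 2600 true strength, 2500 opponent rating, the +/-5% style
bias, and the game counts are all made-up illustrative assumptions, not SSDF
data. Two programs of identical true strength face the same opposition, but
each scores a little above Elo expectation against a "favorite" matchup and a
little below against a "nemesis"; skewing the schedule one way or the other
moves the computed performance rating by roughly 40 points.

import math

def expected_score(r_player, r_opponent):
    # Standard Elo expectation for r_player against r_opponent.
    return 1.0 / (1.0 + 10.0 ** ((r_opponent - r_player) / 400.0))

def performance_rating(opp_rating, score, games):
    # Invert the Elo formula: the rating at which this score against
    # opp_rating would be exactly the expected result.
    p = min(max(score / games, 0.01), 0.99)  # clamp away from 0%/100%
    return opp_rating + 400.0 * math.log10(p / (1.0 - p))

TRUE_RATING = 2600   # both programs genuinely equal (assumption)
OPP_RATING  = 2500   # common opponent pool rating (assumption)
STYLE_BIAS  = 0.05   # +/-5% score shift from style matchups (assumption)

def schedule(fav_games, nem_games):
    # Total score from fav_games against a favorable matchup (+bias)
    # and nem_games against an unfavorable one (-bias).
    e = expected_score(TRUE_RATING, OPP_RATING)
    score = fav_games * (e + STYLE_BIAS) + nem_games * (e - STYLE_BIAS)
    return performance_rating(OPP_RATING, score, fav_games + nem_games)

print("Program A (300 easy, 100 hard): %.0f" % schedule(300, 100))  # ~2619
print("Program B (100 easy, 300 hard): %.0f" % schedule(100, 300))  # ~2581

On the identical 200/200 schedule both numbers coincide at the true strength of
2600, which is exactly the equal-schedule remedy Jim proposes; and over many
thousands of games against many opponents such matchup effects tend to average
out, which is the mathematical sense in which Christophe's defence of the list
holds up.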