Author: James T. Walker
Date: 14:56:08 06/13/01
On June 13, 2001 at 16:14:33, Christophe Theron wrote:

>On June 13, 2001 at 11:20:20, James T. Walker wrote:
>
>>On June 13, 2001 at 00:01:19, Christophe Theron wrote:
>>
>>>On June 12, 2001 at 22:50:01, James T. Walker wrote:
>>>
>>>>On June 12, 2001 at 20:54:16, stuart taylor wrote:
>>>>
>>>>>On June 12, 2001 at 18:41:58, Christophe Theron wrote:
>>>>>
>>>>>>On June 12, 2001 at 14:48:10, Thoralf Karlsson wrote:
>>>>>>
>>>>>>>    THE SSDF RATING LIST 2001-06-11       79042 games played by 219 computers
>>>>>>>                                            Rating   +    -  Games  Won  Oppo
>>>>>>>                                            ------  ---  ---  -----  ---  ----
>>>>>>> 1 Deep Fritz 128MB K6-2 450 MHz            2653    29  -28    647  64%  2551
>>>>>>> 2 Gambit Tiger 2.0 128MB K6-2 450 MHz      2650    43  -40    302  67%  2528
>>>>>>> 3 Chess Tiger 14.0 CB 128MB K6-2 450 MHz   2632    43  -40    308  67%  2508
>>>>>>> 4 Fritz 6.0 128MB K6-2 450 MHz             2623    23  -23    968  64%  2520
>>>>>>> 5 Junior 6.0 128MB K6-2 450 MHz            2596    20  -20   1230  62%  2509
>>>>>>> 6 Chess Tiger 12.0 DOS 128MB K6-2 450 MHz  2576    26  -26    733  61%  2499
>>>>>>> 7 Fritz 5.32 128MB K6-2 450 MHz            2551    25  -25    804  58%  2496
>>>>>>> 8 Nimzo 7.32 128MB K6-2 450 MHz            2550    24  -23    897  58%  2491
>>>>>>> 9 Nimzo 8.0 128MB K6-2 450 MHz             2542    28  -28    612  54%  2511
>>>>>>>10 Junior 5.0 128MB K6-2 450 MHz            2534    25  -25    790  58%  2478
>>>>>>
>>>>>>
>>>>>>Congratulations to Frans Morsch and Mathias Feist (and the ChessBase team).
>>>>>>
>>>>>>Deep Fritz is definitely a very tough client. You cannot lead the SSDF list by
>>>>>>accident, and leading it for so many years in a row is probably the best
>>>>>>achievement of a chess program of all time.
>>>>>>
>>>>>>If you want to sum up the history of chess programs for microcomputers, I think
>>>>>>you just need to remember 3 names:
>>>>>>* Richard Lang
>>>>>>* Frans Morsch and Mathias Feist
>>>>>>
>>>>>>
>>>>>>
>>>>>>    Christophe
>>>>>
>>>>>The roaring absence of the name Christophe appears, of course, in the signature
>>>>>of the post.
>>>>>But I have a little question. Does Deep Fritz have any advantage in the testing,
>>>>>e.g. the fact that it already stood at the top, long before the recent GT even
>>>>>arrived on the scene, and so may have had an advantageous starting point?
>>>>>S.Taylor
>>>>
>>>>Hello Stuart,
>>>>I believe that is a valid question, and I would like to know the answer. I would
>>>>like to know if the SSDF "zeros out" the book learning of, say, Deep Fritz before
>>>>starting a match with Gambit Tiger when Gambit Tiger is brand new. I still
>>>>think the SSDF list is questionable because of the differences in the opponents
>>>>each program has to face. I'm sure it's better than nothing, but I sure wouldn't
>>>>like to hang my hat on a 3-point difference in SSDF ratings (or even 20 points,
>>>>for that matter).
>>>>Jim
>>>
>>>
>>>
>>>I don't question the reliability of the list.
>>>
>>>It is the most reliable tool that we have to evaluate the chess programs. The
>>>difference in the opponents each program has to face does not matter from a
>>>mathematical point of view.
>>>
>>>Year after year we can see that the list is reliable. Almost all objections get
>>>refuted, little by little. Of course it is not absolutely perfect, but I think
>>>it's damn good.
>>>
>>>
>>>
>>>    Christophe
>>
>>Hello Christophe,
>>I think the thread got sidetracked, but I disagree with your assessment of the
>>SSDF list. I agree it's not perfect and it's pretty good, but... I think it's
>>too easy to make one program come out on top by selecting the number of games
>>played against certain opponents. If you could play only one opponent and get a
>>true rating, there would be no problem. We all know this is not the case. Some
>>programs do better against certain opponents and worse against others. So if you
>>play more games against the opponent you do best against, it will inflate your
>>rating. Of course the opposite is also true. So if program "A" plays its favorite
>>opponent while program "B" plays its "nemesis" for more games, then naturally
>>program "A" will look better, even though they may be equal, or the opposite may
>>even be true. This becomes very critical when the difference in rating is only a
>>few points in reality. I'm not saying the SSDF does this on purpose, but I'm sure
>>they are doing nothing to compensate for this possibility. In my opinion, the
>>best way to do the SSDF list would be to make all top programs play an equal
>>number of games against the same opponents. That way the top programs would all
>>play the same number of games against the same opponents, and the list would
>>look like this:
>>
>>Name       Rating  Number of games
>>Program A  2600    400
>>Program B  2590    400
>>Program C  2580    400
>
>
>
>I cannot think of any real evidence that such a phenomenon exists. Can you
>mention, among the top programs, which program gets killed by which other
>program?
>
>Does anyone have statistical evidence of this?
>
>But anyway, even if all programs meet each other, I know some people will say
>that there is another way to bias the results: by letting a given program enter
>or not enter the list, you have an influence on the programs it is supposed to
>kill.
>
>It's a never-ending story.
>
>
>
>    Christophe

Hello Christophe,
You don't have to get killed or be a killer to change the rating by a few
points. The first program that comes to mind is ChessMaster. I believe that
playing a "learning" program against a non-learning program will add rating
points to the learning program as more and more games are played between them.
If this were not the case, then you could just play 500 games against any
opponent you chose and your rating would be just as accurate. In any case, this
"bias" could be avoided with a little planning.
Jim
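To put rough numbers on the "favorite opponent vs. nemesis" argument above, here is a small Python sketch. The ratings and style bonuses are made-up illustration values (nothing the SSDF actually computes): a program with true strength 2600 that performs like 2650 against one 2500-rated opponent and like 2550 against another, measured by a standard Elo performance-rating fit.

```python
def expected_score(r_player, r_opp):
    """Standard Elo expected score for one game."""
    return 1.0 / (1.0 + 10.0 ** ((r_opp - r_player) / 400.0))

def performance_rating(results):
    """results: list of (opp_rating, score_per_game, n_games).
    Bisect for the rating whose predicted total score matches
    the observed total score."""
    observed = sum(s * n for _, s, n in results)
    lo, hi = 1000.0, 4000.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        predicted = sum(expected_score(mid, r) * n for r, _, n in results)
        if predicted < observed:
            lo = mid          # candidate rating too low
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical matchup effects: true strength 2600, but style makes it
# play like 2650 vs. opponent A and like 2550 vs. opponent B (both 2500).
vs_a = expected_score(2650, 2500)   # per-game score vs. its "favorite"
vs_b = expected_score(2550, 2500)   # per-game score vs. its "nemesis"

balanced = performance_rating([(2500, vs_a, 200), (2500, vs_b, 200)])
skewed   = performance_rating([(2500, vs_a, 350), (2500, vs_b, 50)])
print(round(balanced), round(skewed))
```

With 400 games either way, the balanced schedule lands near the true 2600, while the schedule skewed toward the favorite opponent comes out roughly 35-40 points higher, purely from the choice of pairings.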