Author: Bertil Eklund
Date: 02:34:55 06/14/01
On June 14, 2001 at 02:49:18, Martin Schubert wrote:

>On June 13, 2001 at 18:32:45, Bertil Eklund wrote:
>
>>On June 13, 2001 at 17:56:08, James T. Walker wrote:
>>
>>>On June 13, 2001 at 16:14:33, Christophe Theron wrote:
>>>
>>>>On June 13, 2001 at 11:20:20, James T. Walker wrote:
>>>>
>>>>>On June 13, 2001 at 00:01:19, Christophe Theron wrote:
>>>>>
>>>>>>On June 12, 2001 at 22:50:01, James T. Walker wrote:
>>>>>>
>>>>>>>On June 12, 2001 at 20:54:16, stuart taylor wrote:
>>>>>>>
>>>>>>>>On June 12, 2001 at 18:41:58, Christophe Theron wrote:
>>>>>>>>
>>>>>>>>>On June 12, 2001 at 14:48:10, Thoralf Karlsson wrote:
>>>>>>>>>
>>>>>>>>>> THE SSDF RATING LIST 2001-06-11        79042 games played by 219 computers
>>>>>>>>>>
>>>>>>>>>>                                             Rating   +    -  Games  Won  Oppo
>>>>>>>>>>                                             ------  ---  --- -----  ---  ----
>>>>>>>>>>  1 Deep Fritz 128MB K6-2 450 MHz              2653   29  -28   647  64%  2551
>>>>>>>>>>  2 Gambit Tiger 2.0 128MB K6-2 450 MHz        2650   43  -40   302  67%  2528
>>>>>>>>>>  3 Chess Tiger 14.0 CB 128MB K6-2 450 MHz     2632   43  -40   308  67%  2508
>>>>>>>>>>  4 Fritz 6.0 128MB K6-2 450 MHz               2623   23  -23   968  64%  2520
>>>>>>>>>>  5 Junior 6.0 128MB K6-2 450 MHz              2596   20  -20  1230  62%  2509
>>>>>>>>>>  6 Chess Tiger 12.0 DOS 128MB K6-2 450 MHz    2576   26  -26   733  61%  2499
>>>>>>>>>>  7 Fritz 5.32 128MB K6-2 450 MHz              2551   25  -25   804  58%  2496
>>>>>>>>>>  8 Nimzo 7.32 128MB K6-2 450 MHz              2550   24  -23   897  58%  2491
>>>>>>>>>>  9 Nimzo 8.0 128MB K6-2 450 MHz               2542   28  -28   612  54%  2511
>>>>>>>>>> 10 Junior 5.0 128MB K6-2 450 MHz              2534   25  -25   790  58%  2478
>>>>>>>>>
>>>>>>>>>Congratulations to Frans Morsch and Mathias Feist (and the ChessBase team).
>>>>>>>>>
>>>>>>>>>Deep Fritz is definitely a very tough client. You cannot lead the SSDF list by accident, and leading it for so many years in a row is probably the best achievement of a chess program of all time.
>>>>>>>>>
>>>>>>>>>If you want to sum up the history of chess programs for microcomputers, I think you just need to remember 3 names:
>>>>>>>>>* Richard Lang
>>>>>>>>>* Frans Morsch and Mathias Feist
>>>>>>>>>
>>>>>>>>>    Christophe
>>>>>>>>
>>>>>>>>The roaring absence of the name Christophe appears, of course, in the signature of the post.
>>>>>>>>But I have a little question. Does Deep Fritz have any advantage in the testing, e.g. the fact that it already stood at the top long before the recent GT even arrived on the scene, and so may have had an advantageous starting point?
>>>>>>>>S.Taylor
>>>>>>>
>>>>>>>Hello Stuart,
>>>>>>>I believe that is a valid question. I would like to know the answer. I would like to know if the SSDF "zeros out" the book learning of, say, Deep Fritz before starting a match with Gambit Tiger when Gambit Tiger is brand new. I still think the SSDF list is questionable because of the differences in opponents each program has to face. I'm sure it's better than nothing, but I sure wouldn't like to hang my hat on a 3-point difference in SSDF ratings (or even 20 points for that matter).
>>>>>>>Jim
>>>>>>
>>>>>>I don't question the reliability of the list.
>>>>>>
>>>>>>It is the most reliable tool that we have to evaluate the chess programs. The difference in the opponents each program has to face does not matter from a mathematical point of view.
>>>>>>
>>>>>>Year after year we can see that the list is reliable. Almost all objections get refuted, little by little. Of course it is not absolutely perfect, but I think it's damn good.
>>>>>>
>>>>>>    Christophe
>>>>>
>>>>>Hello Christophe,
>>>>>I think the thread got sidetracked, but I disagree with your assessment of the SSDF list. I agree it's not perfect and it's pretty good, but... I think it's too easy to make one program come out on top by selecting the number of games played vs certain opponents. If you could play only one opponent and get a true rating then there would be no problem. We all know this is not the case. Some programs do better against certain opponents and worse vs others. So if you play more games vs the opponent you do best against, it will inflate your rating. Of course the opposite is also true. So if program "A" plays its favorite opponent while program "B" plays its "nemesis" more games, then naturally program "A" will look better even though they may be equal, or even the opposite is true. This becomes very critical when the difference in rating is only a few points in reality. I'm not saying the SSDF does this on purpose, but I'm sure they are doing nothing to compensate for this possibility. In my opinion the best way to do the SSDF list would be to make all top programs play an equal number of games against the same opponents. That way the top programs would all play the same number of games against the same opponents, and the list would look like this:
>>>>>
>>>>>Name       Rating  Number of games
>>>>>Program A    2600              400
>>>>>Program B    2590              400
>>>>>Program C    2580              400
>>>>
>>>>I cannot think of any real evidence that such a phenomenon exists. Can you mention, amongst the top programs, which program gets killed by what other program?
>>>>
>>>>Does anyone have statistical evidence of this?
>>>>
>>>>But anyway, even if all programs meet each other, I know some people will say that there is another way to bias the results: by letting a given program enter or not enter the list, you have an influence on the programs it is supposed to kill.
>>>>
>>>>It's a never-ending story.
>>>>
>>>>    Christophe
>>>
>>>Hello Christophe,
>>>You don't have to get killed or be a killer to change the rating by a few points. The first program that comes to mind is ChessMaster. I believe that playing a "learning" program vs a non-learning program will add rating points to the learning program with more and more games played between them. If this is not the case, then you could just play 500 games vs any opponent you chose and your rating would be just as accurate. In any case this "bias" could be avoided with a little planning.
>>>Jim
>>
>>OK, and what is wrong now that favours program x or y?
>>
>>Bertil
>
>I doubt that the list favours a program. But I think your idea is to play 40 games in a match, so I wonder why not play exactly 40 games. Sometimes you play more, sometimes you play less. I don't think it's a big problem playing 39 or 42 games, but it should be no problem playing the same number. The reason I would prefer this is statistics: the best way to get good statistics for ratings would be to play a tournament like Cadaques, every program against each other the same number of games.
>
>Regards, Martin

Hi!

Usually we try to play 40-game matches, but on the last list (2001-06-11) some matches are not finished. In the match of Tiger against Deep Fritz, at 17-17 or so, Tony received the new Athlon parts and of course he upgraded as soon as he got them! In some cases a match could be shorter because of hardware or software problems.
Bertil
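[Editor's note: Jim's schedule argument can be made concrete with a small worked example. The Python sketch below is purely illustrative; the opponent ratings, score percentages, and game counts are invented, and this is not how the SSDF actually computes its list. It estimates an Elo performance rating for two programs of identical strength whose schedules are weighted differently between a "favourite" and a "nemesis" opponent; the favourite-heavy schedule comes out a handful of points higher, which is exactly the few-point margin the thread is arguing about.]

    # Illustrative only: how an unbalanced schedule can shift a performance rating.
    # All numbers below are invented for the example, not SSDF data or method.

    def expected_score(rating, opp_rating):
        """Standard Elo expected score for one game."""
        return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400))

    def performance_rating(results, tol=0.001):
        """results: list of (opponent_rating, score) pairs, score in {0, 0.5, 1}.
        Bisect for the rating whose expected total matches the actual total."""
        total = sum(score for _, score in results)
        lo, hi = 1000.0, 4000.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if sum(expected_score(mid, opp) for opp, _ in results) < total:
                lo = mid   # expected too low -> trial rating too low
            else:
                hi = mid
        return (lo + hi) / 2

    # Same "true" strength: 70% against a favourite opponent rated 2450,
    # 55% against a nemesis rated 2550. Only the game counts differ.
    favourite_heavy = ([(2450, 1)] * 28 + [(2450, 0)] * 12   # 40 games, 70%
                       + [(2550, 1)] * 11 + [(2550, 0)] * 9) # 20 games, 55%
    nemesis_heavy   = ([(2450, 1)] * 14 + [(2450, 0)] * 6    # 20 games, 70%
                       + [(2550, 1)] * 22 + [(2550, 0)] * 18) # 40 games, 55%

    print(round(performance_rating(favourite_heavy)))  # a few points higher...
    print(round(performance_rating(nemesis_heavy)))    # ...than the nemesis-heavy schedule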