Author: Peter Fendrich
Date: 16:57:52 01/30/01
On January 30, 2001 at 09:50:37, David Wilke wrote:

>On January 30, 2001 at 09:14:48, Hans Christian Lykke wrote:
>
>>On January 30, 2001 at 09:06:09, Jorge Pichard wrote:
>>
>>>Ever since I matched Nimzo 8 vs Junior 6 using my AMD K6-2 500 MHz and also
>>>matched them using my Athlon 800 MHz at G\60 and got different scores; some
>>>people argued that those games were not statistically significants to proof
>>>anything at all. Then we must disregard the SSDF rating list, since each Chess
>>>program only play 40 games against each other and not 200 games.
>>
>>I think that you have misunderstood how SSDF works.
>>
>>ex.: Which rating is the most reliable:
>>
>>1. 400 games played with 200 games against 2 others from the SSDF-list
>>2. 400 games played with 40 games against 10 others from the SSDF-list
>>
>>I´m sure it´s nr. 2.
>
>If it is number 2, then that really only shows the possibility of an overall
>rating, not a rating vs one program. It could also not show a true rating, as
>the program faces programs on weaker hardware.

There is no "rating vs one program". The whole idea of a rating is to find the
mean strength within the population you have chosen. You do not have different
ratings against different opponents. Of course you can have different expected
outcomes against different opponents with the same rating, but that's another
story.

This "weaker hardware" or "different hardware" argument is an old one and a
misunderstanding. Each combination of program and hardware is regarded as one
individual with its own results and rating. If some combinations are weaker
than others, that's just life. If the differences are too large, however, the
rating system itself becomes inaccurate, but that's not an issue in the SSDF
tests.

//Peter
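
To make the "one rating for the whole population" point concrete, here is a minimal sketch in Python. It assumes the standard Elo logistic expected-score formula and finds the single rating that best explains a pooled set of results by bisection. This is only an illustration, not the SSDF's actual procedure; the function names and any opponent ratings you feed it are hypothetical.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score per game for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def pooled_rating(results, lo=0.0, hi=4000.0, tol=1e-6):
    """Find one rating that explains all results at once.

    `results` is a list of (opponent_rating, games, points) tuples
    (a hypothetical format chosen for this sketch). The returned rating
    is the value at which the total expected score over all opponents
    equals the total points actually scored.
    """
    total_points = sum(points for _, _, points in results)

    def surplus(rating):
        # Expected total points at this rating minus points actually scored.
        expected = sum(games * expected_score(rating, opp)
                       for opp, games, _ in results)
        return expected - total_points

    # Expected score rises monotonically with rating, so bisection works.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if surplus(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Feeding it 40 games against each of 10 opponents gives one number for the whole pool. The per-opponent expected scores still differ, which is the "different expected outcomes" mentioned above, but there is never a separate rating per opponent.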
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.