Author: Rolf Tueschen
Date: 05:13:08 05/26/02
On May 26, 2002 at 05:08:43, Tina Long wrote:

>As long as the number of opponents and number of games is large enough, then the
>ratings are as valid as if the programs had played the same opponents. The
>"other" opponents have valid ratings, so the results against "leading" opponents
>are equally valid. Not forgetting of course the degree of accuracy - the +-.

I would not support this. Many aspects are flawed. What is "large enough"? Surely you don't think that 40 games is large enough? Then your wording "equally valid" is unacceptable. I know what you mean, but equality must be built into the testing design itself, not argued for afterwards. The other way round is simply not sound. So I agree with Martin Schubert.

Schubert wrote:

>>My suggestion: the top programs should play the same opponents to make it
>>possible to compare their results.

Tina Long replied:

>This would give more interesting results tables, but theoretically the ratings
>would be no more accurate than the current ratings.
>This would also have the benefit of excluding results where top programs beat
>poorer programs by say 35-5. But again, would theoretically not give more
>accurate ratings.

I don't see your point. What is "accurate"? What do you expect after at most 40 games?

>Remember too that SSDF has a limited number of testers, a limited number of
>computers, and a limited number of copies of programs. I assume they test in
>the way they feel is best for their limited resources & time. They have been
>doing these tests for around 20 years, and are pretty competent at what they're
>doing.

This is not the point. Like you, I used to think that the SSDF had a crowd of amateur testers all over Sweden. But this is false. The SSDF has only very few testers left. This would be one of my proposals for a reform of the SSDF:

°° the open declaration of the testers; I was informed by a real insider that some testers don't even collect their game scores (!)
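To make the "40 games" objection concrete, here is a rough sketch of the statistical margin involved. It uses the standard Elo expectancy curve and a simple binomial error model (an assumption of independent games with a score near 50%, ignoring draws); the function name and numbers are illustrative, not anything the SSDF publishes.

```python
import math

def elo_margin(n_games, score=0.5, z=1.96):
    """Approximate confidence margin (in Elo points) of a rating
    estimated from n_games, assuming independent games and a score
    fraction near `score`. Ignoring draws makes this slightly
    pessimistic."""
    # Standard error of the observed score fraction (binomial model).
    se_score = math.sqrt(score * (1 - score) / n_games)
    # Elo expected score: E(d) = 1 / (1 + 10**(-d/400)).
    # Its slope at d = 0 is ln(10)/1600, so a small error in the
    # score fraction maps to an Elo error of se_score / slope.
    slope = math.log(10) / 1600
    return z * se_score / slope

print(round(elo_margin(40)))    # about +/- 108 Elo after 40 games
print(round(elo_margin(1000)))  # about +/- 22 Elo after 1000 games
```

Under these assumptions, 40 games pin a rating down to no better than about a hundred Elo points either way, which is why small per-pairing samples cannot support strong "equally valid" claims.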
>Every list they publish causes all sorts of speculation regarding the accuracy
>of their results and the correctness of their methodology. It is impossible for
>them to test exactly correctly, and it is more impossible for them to please all
>the people all the time.
>
>I like to take their lists as given, and I always take a good look at the +-.
>
>Regards,
>Tina

This is unacceptable. You are confusing the main aspects. It is _not_ the point that they "could not" test correctly. Of course they could. The accuracy of the results has nothing to do with correctness. In modern times it is no longer accepted that institutions can do whatever they want simply because they "exist". I hope the SSDF does not share your opinion. They could change a few practices and, bingo, they would have a correct testing procedure. The accuracy, of course, is a statistical problem.

Rolf Tueschen