Computer Chess Club Archives


Subject: Re: Comments of latest SSDF list 2

Author: Rolf Tueschen

Date: 05:13:08 05/26/02


On May 26, 2002 at 05:08:43, Tina Long wrote:

>As long as the number of opponents and number of games is large enough, then the
>ratings are as valid as if the programs had played the same opponents.  The
>"other" opponents have valid ratings, so the results against "leading" opponents
>are equally valid.  Not forgetting of course the degree of accuracy - the +-.
>

I would not support this. Many aspects are flawed. What is "large enough"? Surely
you don't think that 40 games is large enough?! And your wording "equally valid"
is unacceptable. I know what you mean, but when you design a test you must build
equality into the design itself, not argue for it afterwards. The other way
round is simply not sound. So I agree with Martin Schubert.

Schubert:
>>My suggestion: the top programs should play the same opponents to make it
>>possible to compare their results.
>
>This would give more interesting results tables, but theoretically the ratings
>would be no more accurate than the current ratings.
>This would also have the benefit of excluding results where top programs beat
>poorer programs by say 35-5.  But again, would theoretically not give more
>accurate ratings.

I don't see your point. What is "accurate"? What do you expect after 40 games
at most?
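
To put a rough number on the 40-game question, here is a minimal sketch. It assumes the standard logistic Elo curve and a normal approximation for the score; it is not the SSDF's own calculation, and the win/draw/loss counts are hypothetical, not taken from any SSDF list.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo difference for a match result.

    Uses a normal approximation on the score fraction (an assumption);
    z=1.96 corresponds to roughly 95% confidence.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    # Clamp so the logarithm stays defined for lopsided results.
    low = max(score - z * se, 1e-6)
    high = min(score + z * se, 1.0 - 1e-6)
    return elo_diff(low), elo_diff(score), elo_diff(high)

if __name__ == "__main__":
    # Hypothetical 40-game match: 16 wins, 12 draws, 12 losses (55%).
    low, mid, high = elo_interval(16, 12, 12)
    print(f"40 games: {mid:+.0f} Elo, interval {low:+.0f} .. {high:+.0f}")
    # The 35-5 result mentioned in the quoted text, as a bare score fraction.
    print(f"35-5 implies about {elo_diff(35 / 40):+.0f} Elo")
```

With a close 40-game result like the hypothetical one above, the interval spans well over 100 Elo points and includes zero, so the match alone cannot say which program is stronger.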

>Remember too that SSDF has a limited number of testers, a limited number of
>computers, and a limited number of copies of programs.  I assume they test in
>the way they feel is best for their limited resources & time. They have been
>doing these tests for around 20 years, and are pretty competent at what they're
>doing.

This is not the point. Like you, I thought that the SSDF had a bunch of amateur
testers all over Sweden. But this is false. The SSDF has only very few testers
left. This would be one of my proposals for a reform of the SSDF:

°° open disclosure of who the testers are; I was informed by a real insider that
some testers don't even collect their game scores (!)


>
>Every list they publish causes all sorts of speculation regarding the accuracy
>of their results and the correctness of their methodology.  It is impossible for
>them to test Exactly correctly, and it is more impossible for them to please all
>the people all the time.
>
>I like to take their lists as given, and I always take a good look at the +-.
>
>Regards,
>Tina

This is unacceptable. You are confusing the main aspects. The point is _not_
that they "could not" test correctly. Of course they could. The accuracy of the
results has nothing to do with the correctness of the method. Nowadays it is no
longer accepted that institutions can do what they want simply because they
exist. I hope that the SSDF does not share your opinion. They could change some
practices and, bingo, their testing would be correct. The accuracy, of course,
is a statistical problem.

Rolf Tueschen


