Computer Chess Club Archives


Subject: Re: SSDF ratings are 100% accurate

Author: Roger

Date: 14:18:46 12/12/99



Yep, the SSDF is a pool unto itself, and as such, its ratings can't be compared
to those of humans.

The problem in saying that the ratings are 100% ACCURATE is: Accurate compared
to what? If the ratings are what they are, that is, if the pool is 100%
isolated, then the statement is tautological: The SSDF ratings are accurate
compared to themselves. Not very exciting.
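
To make the tautology concrete, here is a minimal sketch using the standard Elo expected-score formula (logistic curve, 400-point scale). The program names and ratings are made up for illustration and are not SSDF data. It shows that a prediction depends only on the rating difference, so sliding the whole pool up or down by a constant changes nothing inside the pool.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Hypothetical pool ratings, invented for this example only.
pool = {"program_a": 2600.0, "program_b": 2400.0}

# Shift every rating in the pool by the same arbitrary constant.
shifted = {name: rating - 150.0 for name, rating in pool.items()}

before = expected_score(pool["program_a"], pool["program_b"])
after = expected_score(shifted["program_a"], shifted["program_b"])

# Both predictions are identical: only the 200-point gap matters.
print(round(before, 4), round(after, 4))
```

Nothing inside an isolated pool fixes its absolute level. That is why "accurate" only means something once an outside reference is brought in.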

So my opinion is that statements about the accuracy of the ratings must refer
to some external source of validation, in other words, some reference point
outside the pool itself.

And that, of course, would be human ratings.

So, as more games between IMs or GMs and computers are played, the SSDF ratings
can eventually be recalibrated so that Elo differences between humans and
programs ARE meaningful. It will simply take time for a large enough pool of
games to emerge. Then the whole matter can be handled with the rigor of
statistical methods.
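
One way such a recalibration could look, as a rough sketch only: assume the pool is off from the human scale by a single constant, and pick the constant that makes the predicted total score against rated humans match the actual total. The function name `estimate_offset`, the search bounds, and the game data are all hypothetical. A serious treatment would also want error bars and far more games.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def estimate_offset(games, lo=-1000.0, hi=1000.0, tol=1e-6):
    """Estimate a constant to add to pool ratings (hypothetical helper).

    games: list of (program_pool_rating, human_rating, program_score),
    where program_score is 1.0, 0.5 or 0.0.
    """
    actual = sum(score for _, _, score in games)

    def surplus(offset):
        predicted = sum(expected_score(p + offset, h) for p, h, _ in games)
        return predicted - actual

    # Predicted score rises with the offset, so bisection finds the
    # point where predicted and actual totals agree.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if surplus(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Made-up games: a program rated 2600 in the pool splits 1-1 with a
# human rated 2500, which suggests the pool sits about 100 points high.
games = [(2600.0, 2500.0, 1.0), (2600.0, 2500.0, 0.0)]
offset = estimate_offset(games)
print(round(offset, 1))
```

If the program wins or loses every game, no finite offset fits and the search simply runs to one of its bounds, which is another way of saying that a handful of games is not enough.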

Roger




On December 12, 1999 at 08:49:08, Albert Silver wrote:

>Hi all,
>
>As the issue of SSDF ratings, and their comparative value with USCF or FIDE
>ratings, has been a recurring theme and a number of threads have sprouted
>recently, I thought I'd share my opinion (self-plagiarized) as I think it is
>relevant and might shed some light on the matter.
>
>SSDF ratings: inflated or not?
>Here's what I think: the ratings are not inflated in the least.
>Sounds crazy, doesn't it? But it's not. People get too caught up trying to make
>these futile comparisons between SSDF ratings and human ratings whether USCF,
>FIDE, or whatever. The point is, and it has been repeated very often, there
>simply is no comparison. The only comparison possible is that both are generated
>using Elo's rating system, but that's where it ends. Elo's system is supposed to
>calculate, according to a point system, the probability of success between
>opponents rated in that system. The SSDF rating list does that to perfection,
>but it is based on the members of the SSDF only. If you put Fritz 5.32 on fast
>hardware up against the Tasc R30 or whatnot, it will pulverize the machine. The
>difference in SSDF ratings accurately depicts that. It has NOTHING to do with
>FIDE or USCF ratings. The rating of Fritz, Hiarcs, or others on the SSDF rating
>list depicts their probability of success against other programs on the SSDF
>list, and that's it. It doesn't represent their probability of success against
>humans because humans simply aren't a part of the testing. If you want to find
>out how a program will do against humans then test it against humans, and then
>you will find its rating against them. The SSDF rating has nothing whatsoever
>to do with that. As was pointed out, I believe the SSDF ratings pool is a pool
>that is COMPLETELY isolated from all others and as such cannot possibly be
>compared with them.
>
>                                    Albert Silver



