Computer Chess Club Archives


Subject: Re: SSDF ratings are 100% accurate

Author: Albert Silver

Date: 03:29:21 12/13/99


On December 12, 1999 at 17:18:46, Roger wrote:

>Yep, the SSDF is a pool unto itself, and as such, its ratings can't be compared
>to those of humans.
>
>The problem in saying that the ratings are 100% ACCURATE is: Accurate compared
>to what? If the ratings are what they are, that is, if the pool is 100%
>isolated, then the statement is tautological: The SSDF ratings are accurate
>compared to themselves. Not very exciting.

Not sure what you mean.

>
>So...my opinion is that statements about the accuracy of the ratings must refer
>to some external source of validation, in other words, some reference point
>outside the pool itself.

Why?

>
>And that, of course, would be human ratings.

I don't understand how adding games against humans will make a rating system
that calculates how computers do against other computers more precise. If
anything, it will make the ratings valueless.

>
>So, as more IM and GM versus computer games emerge, the SSDF ratings can
>eventually be recalibrated

Recalibrated? You are assuming they are supposed to be connected. SSDF ratings
measure how computers perform against other computers. If you change the pool,
you change what is being calculated; you do not make it more precise. Where is
it imprecise?

                                  Albert Silver

> so that ELO differences between humans and programs
>ARE meaningful. It will simply take time for a pool of games to emerge. Then the
>whole matter can be handled with the rigor of statistical methods.
>
>Roger
>
>
>
>
>On December 12, 1999 at 08:49:08, Albert Silver wrote:
>
>>Hi all,
>>
>>As the issue of SSDF ratings, and their comparative value with USCF or FIDE
>>ratings, has been a recurring theme and a number of threads have sprouted
>>recently, I thought I'd share my opinion (self-plagiarized) as I think it is
>>relevant and might shed some light on the matter.
>>
>>SSDF ratings: inflated or not?
>>Here's what I think: the ratings are not inflated in the least.
>>Sounds crazy, doesn't it? But it's not. People get too caught up trying to make
>>these futile comparisons between SSDF ratings and human ratings whether USCF,
>>FIDE, or whatever. The point is, and it has been repeated very often, there
>>simply is no comparison. The only comparison possible is that both are generated
>>using Elo's rating system, but that's where it ends. Elo's system is supposed to
>>calculate, according to a point system, the probability of success between
>>opponents rated in that system. The SSDF rating list does that to perfection,
>>but it is based on the members of the SSDF only. If you put Fritz 5.32 on fast
>>hardware up against the Tasc R30 or whatnot, it will pulverize the machine. The
>>difference in SSDF ratings accurately depicts that. It has NOTHING to do with
>>FIDE or USCF ratings. The rating of Fritz, Hiarcs, or others on the SSDF rating
>>list depicts their probability of success against other programs on the SSDF
>>list, and that's it. It doesn't represent their probability of success against
>>humans because humans simply aren't a part of the testing. If you want to find
>>out how a program will do against humans then test it against humans, and then
>>you will find its rating against them. The SSDF rating has nothing whatsoever
>>to do with that. As was pointed out, I believe the SSDF ratings pool is a pool
>>that is COMPLETELY isolated from all others and as such cannot possibly be
>>compared with them.
>>
>>                                    Albert Silver
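
The point about an isolated pool can be illustrated with a small sketch. It assumes the commonly used logistic form of the Elo expected-score curve with a 400-point scale, which may not be the exact variant the SSDF uses. The program names and ratings are hypothetical. The sketch shows that the predicted score depends only on the rating difference inside the pool, so shifting every rating by the same constant changes nothing about the predictions.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the logistic Elo curve (assumed form)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings for two programs in the same pool.
pool = {"program_x": 2600.0, "program_y": 2400.0}
p = expected_score(pool["program_x"], pool["program_y"])

# Shift every rating in the pool by the same arbitrary constant.
shifted = {name: rating - 150.0 for name, rating in pool.items()}
p_shifted = expected_score(shifted["program_x"], shifted["program_y"])

print(f"original pool: {p:.4f}")
print(f"shifted pool:  {p_shifted:.4f}")
```

Both lines print the same value, because only the 200-point gap enters the formula. Under this model, the absolute level of a pool's ratings carries no information on its own; only differences between members of the same pool do.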



Last modified: Thu, 15 Apr 21 08:11:13 -0700
