Computer Chess Club Archives


Subject: Re: SSDF ratings are 100% accurate

Author: Len Eisner

Date: 10:22:30 12/12/99


On December 12, 1999 at 08:49:08, Albert Silver wrote:

>Hi all,
>
>As the issue of SSDF ratings, and their comparative value with USCF or FIDE
>ratings, has been a recurring theme and a number of threads have sprouted
>recently, I thought I'd share my opinion (self-plagiarized) as I think it is
>relevant and might shed some light on the matter.
>
>SSDF ratings: inflated or not?
>Here's what I think: the ratings are not inflated in the least bit.
>Sounds crazy, doesn't it? But it's not. People get too caught up trying to make
>these futile comparisons between SSDF ratings and human ratings whether USCF,
>FIDE, or whatever. The point is, and it has been repeated very often, there
>simply is no comparison. The only comparison possible is that both are generated
>using Elo's rating system, but that's where it ends. Elo's system is supposed to
>calculate, according to a point system, the probability of success between
>opponents rated in that system. The SSDF rating list does that to perfection,
>but it is based on the members of the SSDF only. If you put Fritz 5.32 on fast
>hardware up against the Tasc R30 or whatnot, it will pulverize the machine. The
>difference in SSDF ratings accurately depicts that. It has NOTHING to do with
>FIDE or USCF ratings. The rating of Fritz, Hiarcs, or others on the SSDF rating
>list depicts their probability of success against other programs on the SSDF
>list, and that's it. It doesn't represent their probability of success against
>humans because humans simply aren't a part of the testing. If you want to find
>out how a program will do against humans then test it against humans, and then
>you will find its rating against them. The SSDF rating has nothing whatsoever
>to do with that. As was pointed out, I believe the SSDF ratings pool is a pool
>that is COMPLETELY isolated from all others and as such cannot possibly be
>compared with them.
>
>                                    Albert Silver

I understand the SSDF list accurately reflects the results of comp vs. comp
testing – numbers don’t lie. But up until now, I (and most other people)
assumed you could at least apply the relative SSDF rating differences to people.
In other words, if program A is rated 2600 and program B is rated 2650, I
assumed program B would play 50 points stronger against me. They may have been
2400 and 2450 respectively in FIDE terms, but I assumed the 50-point difference
was correct.
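
To put a number on that 50-point gap, here is a minimal sketch of the
standard Elo expected-score formula, assuming the usual logistic form with a
400-point scale. The function name expected_score is only illustrative, and
the result says something only about opponents rated in the same pool.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) of player A against player B.

    Assumes the common logistic Elo curve with a 400-point scale.
    Only the rating difference matters, not the absolute ratings.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Program B (2650) against program A (2600) on the same list.
ssdf_gap = expected_score(2650, 2600)

# The same 50-point gap shifted down 200 points gives the same expectation.
shifted_gap = expected_score(2450, 2400)

print(f"Expected score at +50 points: {ssdf_gap:.3f}")
print(f"Same gap, shifted pool:       {shifted_gap:.3f}")
```

Under that assumption a 50-point edge is worth roughly 57% of the points
within the pool. The formula by itself says nothing about whether the same
edge carries over to a different pool of opponents.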

Now I am beginning to see that SSDF ratings do not reflect performance against
humans – period. Going back to my example, program B could actually be weaker
than program A against GMs, even though it is 50 points stronger in SSDF comp
vs. comp testing.

I guess this is what Ed Schroder has been saying all along about Rebel.  I need
to think about this for a while.

Len



Last modified: Thu, 15 Apr 21 08:11:13 -0700
