Computer Chess Club Archives


Subject: Re: SSDF ratings are 100% accurate

Author: blass uri

Date: 23:34:49 12/12/99


On December 12, 1999 at 13:22:30, Len Eisner wrote:

>On December 12, 1999 at 08:49:08, Albert Silver wrote:
>
>>Hi all,
>>
>>As the issue of SSDF ratings, and their comparative value with USCF or FIDE
>>ratings, has been a recurring theme and a number of threads have sprouted
>>recently, I thought I'd share my opinion (self-plagiarized) as I think it is
>>relevant and might shed some light on the matter.
>>
>>SSDF ratings: inflated or not?
>>Here's what I think: the ratings are not inflated in the least bit.
>>Sounds crazy doesn't it? But it's not. People get too caught up trying to make
>>these futile comparisons between SSDF ratings and human ratings whether USCF,
>>FIDE, or whatever. The point is, and it has been repeated very often, there
>>simply is no comparison. The only comparison possible is that both are generated
>>using Elo's rating system, but that's where it ends. Elo's system is supposed to
>>calculate, according to a point system, the probability of success between
>>opponents rated in that system. The SSDF rating list does that to perfection,
>>but it is based on the members of the SSDF only. If you put Fritz 5.32 on fast
>>hardware up against the Tasc R30 or whatnot, it will pulverize the machine. The
>>difference in SSDF ratings accurately depicts that. It has NOTHING to do with
>>FIDE or USCF ratings. The rating of Fritz, Hiarcs, or others on the SSDF rating
>>list depicts their probability of success against other programs on the SSDF
>>list, and that's it. It doesn't represent their probability of success against
>>humans because humans simply aren't a part of the testing. If you want to find
>>out how a program will do against humans then test it against humans, and then
>>you will find its rating against them. The SSDF rating has nothing whatsoever
>>to do with that. As was pointed out, I believe the SSDF ratings pool is a pool
>>that is COMPLETELY isolated from all others and as such cannot possibly be
>>compared with them.
>>
>>                                    Albert Silver
>
>I understand the SSDF list accurately reflects the results of comp vs. comp
>testing – numbers don’t lie.  But up until now, I (and most other people)
>assumed you could at least apply the relative SSDF rating differences to people.
> In other words, if program A is rated 2600 and program B is rated 2650, I
>assumed program B would play 50 points stronger against me.  They may have been
>2400 and 2450 respectively in FIDE terms, but I assumed the 50-point difference
>was correct.
>
>Now I am beginning to see that SSDF ratings do not reflect performance against
>humans – period.  Going back to my example, program B could actually be weaker
>than program A against GMs, even though it is 50 points stronger in SSDF comp
>vs. comp testing.

We have no data to tell whether this is the case.

I think that the difference is inflated: a 50-point gap on the SSDF list
(2650 SSDF - 2600 SSDF) is worth less than 50 points of FIDE human rating.
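
For reference, here is a minimal sketch of what a 50-point gap predicts inside
a single rating pool, assuming the standard logistic form of Elo's
expected-score formula on a 400-point scale. The function name
expected_score is only illustrative. The formula describes results between
members of the same pool, so it says nothing about whether an SSDF gap
carries over to play against humans.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (win = 1, draw = 0.5, loss = 0).

    Assumes the standard logistic Elo model with a 400-point scale;
    both ratings must come from the same rating pool.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Program B (2650) against program A (2600), both SSDF ratings.
    print(round(expected_score(2650, 2600), 3))
```

Under that assumption the higher-rated program is expected to score about 57%
against the lower-rated one.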

I am also more interested in whether the analysis is better than in the
performance against humans.

For most customers it does not matter if a program has one weakness that
lets humans beat it, when that weakness is not relevant to the analysis of
99% of their games.

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700
