Author: Albert Silver
Date: 18:40:41 12/11/99
On December 11, 1999 at 13:12:01, Roger wrote:
>
>>
>>So you propose tossing the matches it played against certain opponents? How
>>would you choose which matches to discard?
>>
>> Albert Silver
>
>I don't propose tossing anything. I am just pointing out that the stability of
>Fritz's rating as asserted by Enrique is somewhat of an illusion. That doesn't
>mean his position is false, simply that that single fact, cited to support his
>position, might or might not support him if the ratings for Fritz were
>calculated against the rest of the SSDF pool in segments of 250 games (I pick
>250 as an arbitrary large number).
>
>You MIGHT then see a substantial rating decline, and just eyeballing the Fritz
>numbers supplied by Enrique supports this idea. You would have to do the
>calculations, of course, to see how extensive this decline was.
>
>But...even if the ratings were shown to decline over time, that doesn't
>necessarily make the SSDF ratings flawed. As we all know, GM players book up
>against each other and study each other's games for flaws, and any GM player
>that doesn't do so will see their rating decline.
I'm not sure what the reference to booking up against opponents has to do with
this, but here's what I think: the ratings are not inflated in the least.
Sounds crazy, doesn't it? But it's not. People get too caught up trying to make
these futile comparisons between SSDF ratings and human ratings, whether USCF,
FIDE, or whatever. The point is, and it has been repeated very often, there
simply is no comparison. The only comparison possible is that both are generated
using Elo's rating system, but that's where it ends. Elo's system is supposed to
calculate, from the difference in rating points, the probability of success between
opponents rated in that same system. The SSDF rating list does that to perfection,
but it is based on the members of the SSDF only. If you put Fritz 5.32 on fast
hardware up against the Tasc R30 or whatnot, it will pulverize the machine. The
difference in SSDF ratings accurately depicts that. It has NOTHING to do with
FIDE or USCF ratings. The rating of Fritz, Hiarcs, or others on the SSDF rating
list depicts their probability of success against other programs on the SSDF
list, and that's it. It doesn't represent their probability of success against
humans because humans aren't a part of the testing. If you want to find out how
a program will do against humans then test it against humans, and then you will
find its rating against them. The SSDF rating has nothing whatsoever to do with
that.
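
To put numbers on the idea that a rating difference only predicts results inside its own pool, here is a minimal sketch in Python. It assumes the commonly used logistic form of the Elo expectation; the SSDF's exact computation may differ in detail. The function name expected_score and all ratings below are made up for illustration and are not actual SSDF figures.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (win = 1, draw = 0.5, loss = 0),
    using the logistic Elo formula (assumed here, not taken from the SSDF)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings, chosen only to illustrate the formula.
strong_program = 2700.0
weak_machine = 2300.0

# Only the difference between the two ratings enters the formula.
print(expected_score(strong_program, weak_machine))

# Shifting both ratings by the same amount leaves the prediction unchanged,
# so the absolute numbers mean nothing outside the pool that produced them.
print(expected_score(strong_program - 1000.0, weak_machine - 1000.0))
```

Both calls give the same expectation, because the formula never sees the absolute level of either rating. That is why a number from one pool says nothing by itself about results against a different pool.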
Albert Silver
>
>Roger
Last modified: Thu, 15 Apr 21 08:11:13 -0700