Author: Bertil Eklund
Date: 04:53:26 12/13/99
On December 13, 1999 at 06:29:21, Albert Silver wrote:

>On December 12, 1999 at 17:18:46, Roger wrote:
>
>>Yep, the SSDF is a pool unto itself, and as such, its ratings can't be compared
>>to those of humans.
>>
>>The problem in saying that the ratings are 100% ACCURATE is: Accurate compared
>>to what? If the ratings are what they are, that is, if the pool is 100%
>>isolated, then the statement is tautological: The SSDF ratings are accurate
>>compared to themselves. Not very exciting.
>
>Not sure what you mean.
>
>>
>>So...my opinion is that statements about the accuracy of the ratings must refer
>>to some external source of validation, in other words, some reference point
>>outside the pool itself.
>
>Why?
>
>>
>>And that, of course, would be human ratings.
>
>I don't understand how adding games against humans will make a rating system
>that calculates how computers do against other computers more precise. If
>anything it will make it valueless.
>
>>
>>So, as more IM and GM versus computer games emerge, the SSDF ratings can
>>eventually be recalibrated
>
>Recalibrated? You are assuming they are supposed to be connected. SSDF ratings
>calculate computer versus computer ratings. If you change the pool, you change
>what they are calculating, not making it more precise. Where is it imprecise?

Hi!

Of course they are (or were) connected: back in 1993 the list was tied to human
ratings through about 300 games against humans. The level of the list is still
based on those games.

Bertil SSDF

>                                    Albert Silver
>
>> so that ELO differences between humans and programs
>>ARE meaningful. It will simply take time for a pool of games to emerge. Then the
>>whole matter can be handled with the rigor of statistical methods.
>>
>>Roger
>>
>>
>>On December 12, 1999 at 08:49:08, Albert Silver wrote:
>>
>>>Hi all,
>>>
>>>As the issue of SSDF ratings, and their comparative value with USCF or FIDE
>>>ratings, has been a recurring theme and a number of threads have sprouted
>>>recently, I thought I'd share my opinion (self-plagiarized) as I think it is
>>>relevant and might shed some light on the matter.
>>>
>>>SSDF ratings: inflated or not?
>>>Here's what I think: the ratings are not inflated in the least bit.
>>>Sounds crazy doesn't it? But it's not. People get too caught up trying to make
>>>these futile comparisons between SSDF ratings and human ratings whether USCF,
>>>FIDE, or whatever. The point is, and it has been repeated very often, there
>>>simply is no comparison. The only comparison possible is that both are generated
>>>using Elo's rating system, but that's where it ends. Elo's system is supposed to
>>>calculate, according to a point system, the probability of success between
>>>opponents rated in that system. The SSDF rating list does that to perfection,
>>>but it is based on the members of the SSDF only. If you put Fritz 5.32 on fast
>>>hardware up against the Tasc R30 or whatnot, it will pulverize the machine. The
>>>difference in SSDF ratings accurately depicts that. It has NOTHING to do with
>>>FIDE or USCF ratings. The rating of Fritz, Hiarcs, or others on the SSDF rating
>>>list depicts their probability of success against other programs on the SSDF
>>>list, and that's it. It doesn't represent their probability of success against
>>>humans because humans simply aren't a part of the testing. If you want to find
>>>out how a program will do against humans then test it against humans, and then
>>>you will find its rating against them. The SSDF rating has nothing whatsoever
>>>to do with that. As was pointed out, I believe the SSDF ratings pool is a pool
>>>that is COMPLETELY isolated from all others and as such cannot possibly be
>>>compared with them.
>>>
>>>                                    Albert Silver
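
For readers who want the arithmetic behind the "probability of success" and "isolated pool" points in the quoted posts, here is a minimal sketch. It assumes the commonly used logistic form of the Elo expected-score formula with a 400-point scale; the thread does not say exactly how the SSDF computes its list, so treat this as an illustration only. The function name `expected_score`, the program names, the ratings, and the offset are all hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (common logistic Elo form, assumed here)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings for illustration only (not actual SSDF figures).
pool = {"program_x": 2600.0, "program_y": 2400.0}
p = expected_score(pool["program_x"], pool["program_y"])

# Shift every rating in the pool by the same arbitrary amount.
offset = -150.0
shifted = {name: rating + offset for name, rating in pool.items()}
p_shifted = expected_score(shifted["program_x"], shifted["program_y"])

print(f"expected score before shift: {p:.3f}")
print(f"expected score after shift:  {p_shifted:.3f}")
```

Under this formula only the rating difference enters the prediction, so moving the whole pool up or down leaves every computer-versus-computer prediction unchanged. The absolute level of an isolated pool is therefore arbitrary until it is fixed against something outside the pool, which is the role played by the roughly 300 games against humans from 1993 mentioned above.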
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.