Author: Robert Hyatt
Date: 17:40:27 10/11/99
On October 11, 1999 at 17:12:56, Leon Stancliff wrote:

>On October 11, 1999 at 13:46:39, Dann Corbit wrote:
>
>>James Robertson already correctly answered your post, but I thought I would add
>>a bit of fluff in the way of explanation.
>>
>>Imagine a room full of blithering idiots. Perhaps 10,000 of them. We have them
>>all play chess against each other for a few years. The top idiot will have an
>>ELO of perhaps 2500 or better. Now, it may be that your 5-year-old sister can
>>beat his pants off. After all, he's a blithering idiot.
>>
>>The point is that ELO calculations are relative to the competition. So ELO
>>calculations are *only* relevant to the pool of players that they compete with.
>>Now, take the SSDF list. The ELO figures definitely show relative strength. But
>>that does not necessarily show how they will play against humans. For instance,
>>one of the top five programs might have a systematic flaw which, when
>>discovered, will allow a human to always beat it. The computer programs may
>>never try to exploit this flaw, and so the ELO within the SSDF pool remains
>>constant. But if humans discover the flaw, they will exploit it.
>>
>>Now, how can we reconcile the SSDF ratings with human ratings? Really, we
>>can't. It may be that there is a direct correspondence of some sort. It may be
>>that the computers are actually stronger or weaker than humans with the same
>>ratings.
>>
>>In any case, it is true that nearly all modern computer programs are formidable
>>opponents.
>
>It is true, however, that after Rebel 10 has played as many as ten games against
>a variety of grandmasters, we have an approximate anchor for estimations. If we
>know Rebel 10 can play at 2525 Elo against ten different grandmasters, we can
>compare the relative ratings of the SSDF list and adjust all of them by the
>difference between Rebel's SSDF rating and its actual performance against humans.

That doesn't come close to being statistically sound. I.e., take any one player
in your local club, and let him go off and get a USCF rating. Then let him come
back and play your club members one at a time in 10-game matches, and use those
results to rate each player. Do you think those ratings would be accurate? Of
course not... the rating system doesn't work like that.

>Yes, any one computer program may have a flaw which will allow humans to exploit
>that flaw, but I think the time when humans can just summarily dismiss the
>rating lists is quickly drawing to a close.

I never 'summarily' dismiss the 'list', because the distance between any two
programs on that list is directly proportional to the difference in strength
between the programs. But I definitely ignore the absolute value of the ratings,
since the SSDF numbers are (IMHO) significantly inflated over what they would be
were the programs to compete in (say) FIDE events...
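[To make the pool-relativity point concrete, here is a minimal Python sketch
of the standard Elo expected-score formula; the function name is mine, for
illustration only. Only the rating *difference* enters the formula, so adding
a constant offset to every rating in a closed pool such as the SSDF list
changes none of its internal predictions:]

    # Standard Elo expected-score formula. Only the rating difference
    # matters, which is why a closed pool's ratings can drift as a whole
    # relative to another pool (SSDF vs. FIDE) while staying internally
    # consistent.
    def expected_score(r_a: float, r_b: float) -> float:
        """Expected score (0..1) of a player rated r_a against r_b."""
        return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

    # A 2500 vs. a 2300 has the same expectation as a 2700 vs. a 2500:
    print(expected_score(2500, 2300))  # ~0.76
    print(expected_score(2700, 2500))  # ~0.76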
>To earn a grandmaster title the human must have three results within a one-year
>period that are 2550 or above. Rebel 10 may well accomplish that feat. We are
>informed that a human who can generally play at 2500 can very likely qualify.
>It is my personal opinion that if Rebel can play at or above the 2500 rating
>over an extended length of time, it should be recognized as being of
>grandmaster strength. Truly the SSDF ratings are inflated. We will now have a
>method for determining just how much!

GM titles are a bit harder to earn than your description implies. To earn a GM
title, a player has to do two things: (1) play in 3 events and earn 'norms' in
each... the points required per tournament vary according to the number of
players and the average rating; and (2) maintain a rating over 2500 for the
course of earning the 3 norms. Not easy. And a 2500+ TPR over the 3 tournaments
won't do it... the norms really require a 2600+ TPR, due to the way the
calculations are done... _very_ difficult.
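[For reference, the tournament performance rating (TPR) mentioned above can be
approximated by inverting the expected-score curve. This is a hedged sketch
using the common logistic approximation, not the official FIDE dp lookup table,
which works from a discrete table of percentage scores and caps the rating
difference at +/-800:]

    import math

    def performance_rating(opponent_ratings: list[float], score: float) -> float:
        """Approximate TPR: average opponent rating plus the rating
        difference implied by the percentage score. Tracks the FIDE
        dp table closely but is not the official calculation."""
        n = len(opponent_ratings)
        avg = sum(opponent_ratings) / n
        p = score / n
        p = min(max(p, 0.01), 0.99)  # avoid infinities at 0% or 100%
        return avg + 400.0 * math.log10(p / (1.0 - p))

    # Scoring 6.5/9 against a field averaging 2480 performs at roughly
    # the 2600+ level a GM norm requires:
    print(round(performance_rating([2480] * 9, 6.5)))  # ~2646

[Plugging numbers like these in shows why a bare 2500 performance falls well
short of a norm: against a 2480-average field, even 6/9 only works out to
about 2580.]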