Author: Robert Hyatt
Date: 06:33:26 10/12/99
On October 12, 1999 at 08:49:25, Leon Stancliff wrote:

>On October 11, 1999 at 20:40:27, Robert Hyatt wrote:
>
>>On October 11, 1999 at 17:12:56, Leon Stancliff wrote:
>>
>>>On October 11, 1999 at 13:46:39, Dann Corbit wrote:
>>>
>>>>James Robertson already correctly answered your post, but I thought I would add
>>>>a bit of fluff in the way of explanation.
>>>>
>>>>Imagine a room full of blithering idiots. Perhaps 10,000 of them. We have them
>>>>all play chess against each other for a few years. The top idiot will have an
>>>>ELO of perhaps 2500 or better. Now, it may be that your 5 year old sister can
>>>>beat his pants off. After all, he's a blithering idiot.
>>>>
>>>>The point is that ELO calculations are relative to the competition. So ELO
>>>>calculations are *only* relevant to the pool of players that they compete with.
>>>>Now, take the SSDF list. The ELO figures definitely show relative strength. But
>>>>that does not necessarily show how they will play against humans. For instance,
>>>>one of the top five programs might have a systematic flaw, which when discovered
>>>>will allow a human to always beat it. The computer programs may never try to
>>>>exploit this flaw and so the ELO within the SSDF pool remains constant. But if
>>>>humans discover the flaw, they will exploit it.
>>>>
>>>>Now, how can we correspond the SSDF ratings with human ratings? Really, we
>>>>can't. It may be that there is a direct correspondence of some sort. It may be
>>>>that the computers are actually stronger or weaker than humans with the same
>>>>ratings.
>>>>
>>>>In any case, it is true that nearly all modern computer programs are formidable
>>>>opponents.
>>>
>>>It is true, however, that after Rebel 10 has played as many as ten games against
>>>a variety of grandmasters, we have an approximate anchor for estimations. If we
>>>know Rebel 10 can play at 2525 Elo against ten different grandmasters, we can
>>>compare the relative ratings of SSDF and adjust all of them the same amount as
>>>the difference between Rebel's SSDF and its actual performance against humans.
>>
>>that doesn't come close to being statistically sound. IE take any one
>>player in your local club, and let him go off and get a USCF rating. Then
>>let him come back and play your club members one at a time for 10 game
>>matches, and use those results to rate each player. Do you think they would
>>be accurate? Of course not...
>>
>>the rating system doesn't work like that.
>>
>>>Yes, any one computer program may have a flaw which will allow humans to exploit
>>>that flaw, but I think the time when humans can just summarily dismiss the
>>>rating lists is quickly drawing to a close.
>>
>>I never 'summarily' dismiss the 'list'. Because the distance between any two
>>programs on that list is directly proportional to the difference in strength
>>between the programs. But I definitely ignore the absolute value of the
>>ratings, since the SSDF numbers are (IMHO) significantly inflated over what
>>they would be were the programs to compete in (say) FIDE events...
>>
>>>To earn a grandmaster title the human must have three results within a one
>>>year period that are 2550 or above. Rebel 10 may well accomplish that feat. We
>>>are informed that a human who can generally play at 2500 can very likely
>>>qualify. It is my personal opinion that if Rebel can play at or above the 2500
>>>rating over an extended length of time, it should be recognized as being of
>>>grandmaster strength. Truly the SSDF ratings are inflated. We will now have a
>>>method for determining just how much!
>>
>>GM ratings are a bit harder to earn than your description implies. To earn a
>>GM title, a player has to do two things. (1) play in 3 events and earn 'norms'
>>in each... the points required per tournament vary according to the number of
>>players and the average rating. (2) maintain a rating over 2500 for the course
>>of earning the 3 norms. Not easy. And a 2500+ TPR over the 3 tournaments
>>won't do it... the norms really require a 2600+ TPR due to the way the
>>calculations are done... _very_ difficult.
>
>This discussion is very interesting. Alexander Stripunsky just received the
>Grandmaster title. His Elo is listed at 2492. To have played at 2600+ for his
>three norms, he would have had to play over 100 points above his elo in three
>tournaments within a period of one year... Very difficult!

true... but you really wouldn't trust a rating that was 'barely' GM over the three events... the margin of error is too high. FIDE went with the approach that was more certain. If the margin of error is 25, and you require a rating at least 50 over the GM 'floor', you are fairly safe in concluding the player is safely playing over the GM boundary.

>
>If I have counted correctly, there are 39 active male Grandmasters in the U.S.A.
>Some 30 of these have Elo ratings of 2500+. The average is about 2550. Obviously
>several of these Grandmasters are in the declining years. However, the majority
>of them fall in the 2500-2600 category.
>
>We are not talking about ratings which are accurate to the nearest elo point. I
>am confident that Rebel's results against the number of different Grandmaster
>opponents that would be faced by a human in three tournaments (Perhaps 15) the
>earned elo rating would be accurate within a margin of 25 points.
>
>By the way, Bob, Do you know how I can obtain the numbers used to determine the
>Grandmaster norm for any one specific tournament? I might add that I appreciate
>your heavy involvement in the CCC. Although I do not always agree with your
>conclusions, I do feel your contribution to computer chess is invaluable.

FIDE's web site used to have this. I haven't looked there in a long while, and they had some serious web site problems earlier this year, so I am not sure it is still there. I believe it is in the FIDE rules of chess as well, but my copy of that grew legs and walked away over the years. :)

Note that a GM title isn't awarded for 15 games, period. It requires three separate events, with a minimum number of rounds each, and with the number of points required calculated based on the average rating of all the entrants (ie you see 'category 13' events and so forth). Matches and single games don't count, so we have to be a little generous in this regard.
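
The margin-of-error question raised in this exchange can be made concrete with a small sketch. It assumes the standard logistic Elo curve (FIDE's own conversion tables differ slightly), treats games as independent, and uses hypothetical function names; it is not FIDE's norm calculation, only a rough estimate of how uncertain a performance rating is after a given number of games.

```python
import math


def expected_score(rating_diff):
    """Logistic Elo expectation for a player rated rating_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def performance_rating(avg_opponent, score, games):
    """Rating at which the expected score equals the actual score fraction."""
    p = score / games
    if not 0.0 < p < 1.0:
        raise ValueError("performance is undefined for 0% or 100% scores")
    return avg_opponent + 400.0 * math.log10(p / (1.0 - p))


def rating_std_error(score, games, draw_fraction=0.0):
    """Approximate one-sigma error of the performance rating, in Elo points.

    Assumption: independent games against similar opposition.
    Per-game score variance for win/draw/loss results is p(1-p) - d/4,
    where d is the fraction of drawn games.
    """
    p = score / games
    if not 0.0 < p < 1.0:
        raise ValueError("error is undefined for 0% or 100% scores")
    variance = p * (1.0 - p) - draw_fraction / 4.0
    se_fraction = math.sqrt(variance / games)
    # Slope of rating with respect to score fraction (delta method).
    slope = 400.0 / (math.log(10.0) * p * (1.0 - p))
    return se_fraction * slope


if __name__ == "__main__":
    for n in (10, 15, 50, 200):
        print(n, "games:", round(rating_std_error(n / 2, n), 1), "Elo (one sigma)")
```

Under these assumptions, an even score over 10 decisive games carries a one-sigma uncertainty of roughly 110 rating points, and 15 games still leaves about 90. Draws narrow the figure somewhat, but getting near 25 points takes on the order of a couple of hundred games.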
Last modified: Thu, 15 Apr 21 08:11:13 -0700