Author: Leon Stancliff
Date: 05:49:25 10/12/99
On October 11, 1999 at 20:40:27, Robert Hyatt wrote:

>On October 11, 1999 at 17:12:56, Leon Stancliff wrote:
>
>>On October 11, 1999 at 13:46:39, Dann Corbit wrote:
>>
>>>James Robertson already correctly answered your post, but I thought I would
>>>add a bit of fluff in the way of explanation.
>>>
>>>Imagine a room full of blithering idiots. Perhaps 10,000 of them. We have
>>>them all play chess against each other for a few years. The top idiot will
>>>have an Elo of perhaps 2500 or better. Now, it may be that your 5-year-old
>>>sister can beat his pants off. After all, he's a blithering idiot.
>>>
>>>The point is that Elo calculations are relative to the competition. So Elo
>>>calculations are *only* relevant to the pool of players they compete with.
>>>Now, take the SSDF list. The Elo figures definitely show relative strength,
>>>but that does not necessarily show how the programs will play against
>>>humans. For instance, one of the top five programs might have a systematic
>>>flaw which, once discovered, will allow a human to always beat it. The
>>>computer programs may never try to exploit this flaw, and so the Elo within
>>>the SSDF pool remains constant. But if humans discover the flaw, they will
>>>exploit it.
>>>
>>>Now, how can we correspond the SSDF ratings with human ratings? Really, we
>>>can't. It may be that there is a direct correspondence of some sort. It may
>>>be that the computers are actually stronger or weaker than humans with the
>>>same ratings.
>>>
>>>In any case, it is true that nearly all modern computer programs are
>>>formidable opponents.
>>
>>It is true, however, that after Rebel 10 has played as many as ten games
>>against a variety of grandmasters, we have an approximate anchor for
>>estimations. If we know Rebel 10 can play at 2525 Elo against ten different
>>grandmasters, we can compare the relative ratings of the SSDF list and
>>adjust all of them by the same amount as the difference between Rebel's
>>SSDF rating and its actual performance against humans.
>
>That doesn't come close to being statistically sound. I.e., take any one
>player in your local club, and let him go off and get a USCF rating. Then
>let him come back and play your club members one at a time in 10-game
>matches, and use those results to rate each player. Do you think they would
>be accurate? Of course not...
>
>The rating system doesn't work like that.
>
>>Yes, any one computer program may have a flaw which will allow humans to
>>exploit that flaw, but I think the time when humans can just summarily
>>dismiss the rating lists is quickly drawing to a close.
>
>I never 'summarily' dismiss the 'list', because the distance between any two
>programs on that list is directly proportional to the difference in strength
>between the programs. But I definitely ignore the absolute value of the
>ratings, since the SSDF numbers are (IMHO) significantly inflated over what
>they would be were the programs to compete in (say) FIDE events...
>
>>To earn a grandmaster title, the human must have three results within a
>>one-year period that are 2550 or above. Rebel 10 may well accomplish that
>>feat. We are informed that a human who can generally play at 2500 can very
>>likely qualify. It is my personal opinion that if Rebel can play at or above
>>the 2500 rating over an extended length of time, it should be recognized as
>>being of grandmaster strength. Truly the SSDF ratings are inflated. We will
>>now have a method for determining just how much!
>
>GM ratings are a bit harder to earn than your description implies. To earn a
>GM title, a player has to do two things. (1) Play in 3 events and earn
>'norms' in each... the points required per tournament vary according to the
>number of players and the average rating. (2) Maintain a rating over 2500
>for the course of earning the 3 norms. Not easy. And a 2500+ TPR over the 3
>tournaments won't do it... the norms really require a 2600+ TPR due to the
>way the calculations are done... _very_ difficult.

This discussion is very interesting. Alexander Stripunsky just received the
Grandmaster title. His Elo is listed at 2492. To have played at 2600+ for his
three norms, he would have had to play over 100 points above his Elo in three
tournaments within a period of one year... very difficult!

If I have counted correctly, there are 39 active male Grandmasters in the
U.S.A. Some 30 of these have Elo ratings of 2500+, and the average is about
2550. Obviously several of these Grandmasters are in their declining years;
however, the majority of them fall in the 2500-2600 range. We are not talking
about ratings which are accurate to the nearest Elo point. I am confident
that if Rebel faced the number of different Grandmaster opponents a human
would meet in three tournaments (perhaps 15), the earned Elo rating would be
accurate to within a margin of 25 points.

By the way, Bob, do you know how I can obtain the numbers used to determine
the Grandmaster norm for any one specific tournament? I might add that I
appreciate your heavy involvement in the CCC. Although I do not always agree
with your conclusions, I do feel your contribution to computer chess is
invaluable.
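
To make Dann Corbit's point about pool-relative ratings concrete, here is a
minimal sketch of the standard Elo expected-score formula. The ratings plugged
in are made up; the only point is that a rating difference predicts results
between members of the same pool, not across pools that were never calibrated
against each other.

    # Expected score of player A against player B under the Elo model:
    # E_A = 1 / (1 + 10 ** ((R_B - R_A) / 400))
    def expected_score(r_a, r_b):
        return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

    # Within the "idiot pool", a 2500 player is expected to score about 0.91
    # per game against a 2100 player from the same pool...
    print(expected_score(2500, 2100))   # ~0.91
    # ...but that 2500 says nothing about results against players rated on a
    # different scale (e.g. FIDE-rated humans), because the two pools were
    # never calibrated against each other.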
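
The anchoring idea quoted above, shifting every SSDF rating by the gap between
Rebel's SSDF number and its measured performance against humans, is simple
arithmetic; whether it is statistically sound is exactly what Bob disputes. A
sketch with purely illustrative numbers follows; the 2625 SSDF figure and the
small sample list are assumptions, not actual SSDF data.

    # Hypothetical numbers for illustration only.
    rebel_ssdf_rating = 2625          # Rebel's rating inside the SSDF pool (assumed)
    rebel_vs_gm_performance = 2525    # performance rating from games vs. grandmasters

    offset = rebel_vs_gm_performance - rebel_ssdf_rating   # -100 in this example

    ssdf_list = {"Program A": 2680, "Program B": 2640, "Rebel 10": 2625}
    adjusted = {name: rating + offset for name, rating in ssdf_list.items()}
    print(adjusted)   # every entry shifted down by 100 points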
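
Bob's point that norms turn on tournament performance rating (TPR) can be
roughed out with the common linear approximation: TPR is about the average
opponent rating plus 400 * (wins - losses) / games. FIDE's actual norm
calculation uses a percentage-to-rating lookup table rather than this straight
line, so the sketch below is only a rough guide.

    def linear_tpr(opponent_ratings, wins, draws, losses):
        """Rough tournament performance rating via the 'rule of 400'."""
        n = wins + draws + losses
        avg_opp = sum(opponent_ratings) / len(opponent_ratings)
        return avg_opp + 400.0 * (wins - losses) / n

    # Nine rounds against opposition averaging 2450: scoring 6.5/9
    # (5 wins, 3 draws, 1 loss) gives roughly 2450 + 400*4/9, i.e. ~2628.
    print(linear_tpr([2450] * 9, wins=5, draws=3, losses=1))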
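
As for pinning a performance down to within 25 points from roughly 15 games:
a back-of-the-envelope way to attach an uncertainty to a performance estimate
is to take the standard error of the measured score and convert it to Elo
points through the slope of the expected-score curve near an even match. The
sketch below assumes roughly even games and a per-game score standard
deviation of 0.5, both simplifications.

    import math

    def rating_standard_error(n_games, score_sd_per_game=0.5):
        """Approximate 1-sigma uncertainty (in Elo points) of a performance
        estimate from n_games of roughly even chess."""
        # Slope of the Elo expected-score curve at a 0-point difference:
        # dE/dR = ln(10) / 1600  (score units per Elo point)
        slope = math.log(10) / 1600.0
        score_se = score_sd_per_game / math.sqrt(n_games)
        return score_se / slope

    print(round(rating_standard_error(15)))   # on the order of 90 Elo points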