Author: Robert Hyatt
Date: 07:00:07 12/22/99
On December 22, 1999 at 08:16:55, Graham Laight wrote:

>On December 21, 1999 at 23:16:18, Robert Hyatt wrote:
>
>>On December 21, 1999 at 21:23:57, Graham Laight wrote:
>>
>>>On December 21, 1999 at 18:10:05, Fernando Villegas wrote:
>>>
>>>>On December 21, 1999 at 13:15:11, Graham Laight wrote:
>>>>
>>>>>I apologise for bringing up a subject which has undoubtedly already been
>>>>>discussed, but according to the SSDF ratings, Chess Tiger is 2696.
>>>>>
>>>>>According to the FIDE ratings, there are only 11 players in the world with a
>>>>>higher rating than this.
>>>>>
>>>>>Can this possibly be correct?
>>>>>
>>>>>Graham
>>>>
>>>>As has been said before, Elo ratings between computers are valid within the
>>>>community of computers and have an unclear, perhaps permanently murky,
>>>>relation to the Elo ratings of human players. In fact, no method is known
>>>>for determining that relation. Only guesses. If monkeys played chess,
>>>>they too would have an Elo rating, but I am sure you would not equate the
>>>>Elo of Sheeta with that of Gary. Sorry for the monkeys.
>>>>Fernando
>>>
>>>If monkeys could play chess, their Elo rating would be very low - so they would
>>>be comparable to Gary. Monkey Elo would probably be about 100; Gary's is over
>>>2800.
>>
>>Not true. If the monkeys _only_ play other monkeys, some could easily have
>>ratings over 2800. Of course that would have nothing to do with FIDE
>>ratings...
>>
>>>Following the link on Albert Silver's post to the previous discussion, it
>>>appears that Albert (and others) are saying the same thing - that because you're
>>>not comparing like with like, the computer Elo ratings are not valid.
>>
>>No, that isn't what he said. He said that computer (SSDF) ratings are
>>perfectly valid. But they have _nothing_ to do with FIDE ratings, other
>>than both being four-digit base-ten numbers.
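The point above about monkeys rated over 2800 is the crux of the argument: standard Elo updates only redistribute points within a closed pool, so the absolute level of a pool's ratings is set entirely by the (arbitrary) starting baseline. A minimal sketch in Python, using the usual logistic Elo formulas (the pool size, K-factor of 20, and 2400 baseline are illustrative assumptions, not anyone's actual parameters):

```python
import random

def expected(r_a, r_b):
    # Elo expected score for player A against player B
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, score_a, k=20):
    # Standard Elo update; points gained by A equal points lost by B
    delta = k * (score_a - expected(r_a, r_b))
    return r_a + delta, r_b - delta

# A closed pool seeded at an arbitrary baseline of 2400
random.seed(1)
pool = [2400.0] * 10
for _ in range(10000):
    i, j = random.sample(range(10), 2)
    result = random.choice([1.0, 0.5, 0.0])  # arbitrary game results
    pool[i], pool[j] = update(pool[i], pool[j], result)

# The mean rating never moves: individual ratings spread out, but the
# pool's absolute level is whatever baseline it was seeded with.
print(sum(pool) / len(pool))
```

However many games are played, the pool mean stays at 2400, so whether a closed pool of monkeys (or computers) shows numbers near 100 or near 2800 says nothing about how they compare to FIDE-rated humans.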
>>The monkey rating would have nothing
>>to do with either of these either, unless the monkeys played in the same pool
>>of players with one of the two groups (FIDE players or SSDF-tested computer
>>programs).
>
>SSDF believe that there is sufficient play between the pool of humans and the
>pool of computers to be able to say that their ratings are valid with respect to
>Swedish (and hence FIDE) human Elo ratings.
>
>At the risk of being labelled "naughty", I have copied the following paragraphs
>directly from the FAQ page on the SSDF web site:
>
>*** Start Of Quoted Material ***
>
>Q: How are the ratings calculated?
>
>A: SSDF uses its own rating program, written by our member Lars Hjorth, but the
>basic formulas are derived from Arpad Elo's Elo rating system. Our program
>calculates, for each computer, the average rating of its opponents and how many
>points it has scored. Given those two numbers, Professor Elo's formulas produce
>a rating.
>
>However, if all computers are only tested against other computers, all we get is
>a relative rating that is valid only among those computers. Therefore, SSDF has
>played several hundred games between computers and human players in serious
>tournaments and used these results to set a "correct" absolute level for the
>rating list according to Swedish conditions. Different national rating systems
>are not completely in accord, though, and that has to be taken into account
>when reading our list. For instance, US ratings seem to lie approximately 50
>points above the corresponding Swedish ratings (maybe more below 2000 and
>less on the other side of the scale). For ourselves we obviously use the Swedish
>scale.

That ought to be removed from their FAQ. That calibration was done in the 1980s
and has _not_ been done since. Do you think there are still any vestiges of
those old human ratings left, since for the past 10+ years they have _only_
included computer vs. computer games in their ratings?
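The calculation the FAQ describes (average opponent rating plus points scored, fed into Elo's formulas) corresponds to the standard performance-rating computation. A minimal sketch in Python, using the logistic form of Elo's expected-score formula; this is an illustration of the general method, not the SSDF's actual program:

```python
import math

def performance_rating(avg_opponent_rating, score_fraction):
    """Invert the Elo expected-score formula to get a rating.

    avg_opponent_rating: mean rating of the opposition
    score_fraction: points scored divided by games played (0..1)
    """
    # Clamp to avoid infinities at 0% or 100% scores
    p = min(max(score_fraction, 0.001), 0.999)
    return avg_opponent_rating - 400 * math.log10(1 / p - 1)

# A 50% score against 2400 opposition performs at exactly 2400;
# a 75% score against the same field performs roughly 191 points higher.
print(performance_rating(2400, 0.50))
print(performance_rating(2400, 0.75))
```

Note that the formula only locates a player relative to the opposition's average: if the opponent ratings themselves come from a closed computer-only pool, the output inherits that pool's baseline, which is exactly the objection raised above.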
For this to be valid as a comparison between SSDF and FIDE ratings, they would
have to _regularly_ play several of the programs in FIDE events and rate them in
their 'pool' using the results of those games plus the comp/comp games. They
don't do this.

>We firmly believe that our ratings are correct in the sense that if a computer
>were to play a sufficient number of games against Swedish humans, it would end
>up with a rating close to what it has on our list. Unfortunately, as programs
>get better it becomes increasingly difficult to arrange meaningful games against
>human players. Reassuringly, we've noted that our ratings are fairly consistent
>with the results from the yearly Aegon tournament in Holland.

Baloney nowadays. No program would consistently play at near 2700 at Aegon.

>*** End Of Quoted Material ***
>
>>>I have yet to be convinced, I'm afraid. Firstly, on their web site, SSDF say
>>>they have done some research to ensure that their rating ranges are reasonably
>>>accurate. In the past, for example, they have used the Aegon tournament to check
>>>the validity of their rating ranges.
>>>
>>>Secondly, much of the argument revolved around the idea that computers are prone
>>>to making moves which are weak from the positional perspective - and that only
>>>one such weak move is needed to lose a game against a grandmaster. However, I
>>>would question this for the following reasons:
>>>
>>>* Computers have a remarkably good ability to survive the resulting "crushing"
>>>attacks. Sometimes, when they find an escape, they are able to go on and win
>>>the game.
>>>
>>>* IMs and above tend to divide themselves into "active" players (e.g. Maurice
>>>Ashley) and "positional" players (e.g. Yasser Seirawan, Anatoly Karpov).
>>>Certainly players like Yasser were, in the past, able to beat computers (Yasser
>>>is a previous winner of Aegon).
>>>But players like Kasparov (who tends to lose to
>>>computers) must have all (or most) of the positional players' knowledge, because
>>>his Elo rating is so much higher than theirs.
>>
>>Kasparov doesn't tend to lose to computers, excepting one match vs. DB.
>
>He lost a G/25 match against Genius in London in 1994 - and Kasparov had the
>white pieces! Worse still, the computer, from memory, was only a Pentium 90 -
>not even a Pentium II!
>
>Gary's problem seems to be his style: he is a brilliant tactician who is not
>usually frightened of seeing a complicated melee in front of him. But Deeper
>Blue (and the other computers) have taught him to be afraid - very afraid - of
>tactics when he's up against a machine.
>
>>>To organise another Aegon-style tournament would probably cost about $120,000,
>>>and it's entirely possible that, because IBM have basically milked much of the
>>>publicity available for human v computer chess, sponsorship would be very
>>>difficult to obtain. So, for the time being, we're stuck with jumping on every
>>>little scrap of information to try to create a (moving) picture of what the
>>>reality of the ratings is like.
>>>
>>>Graham
>>
>>Ed is providing some reasonable data, although it is taking time to get enough
>>games to draw conclusions. But at least in another year or so we will have
>>some vague notion about the FIDE rating Rebel might be playing at.
>
>Then SSDF will have to rate his program. At the moment, he's asked for his
>programs not to be rated by SSDF. Ed seems to have a fear of competition. He has
>long since stopped entering computer tournaments. He's not afraid to compete
>against top players because, at the moment, he's the only one doing it.
>
>Graham

I don't believe he has a fear of competing. I think he had a fear of abuse, as
it seems that some autoplayer issues were surfacing that indicated a bit of foul
play was possible...