Computer Chess Club Archives



Subject: Re: Chess Tiger - Is It Really 2696 ELO?

Author: Albert Silver

Date: 06:44:56 12/22/99

On December 22, 1999 at 08:16:55, Graham Laight wrote:

>On December 21, 1999 at 23:16:18, Robert Hyatt wrote:
>
>>On December 21, 1999 at 21:23:57, Graham Laight wrote:
>>
>>>On December 21, 1999 at 18:10:05, Fernando Villegas wrote:
>>>
>>>>On December 21, 1999 at 13:15:11, Graham Laight wrote:
>>>>
>>>>>I apologise for bringing up a subject which has undoubtedly already been
>>>>>discussed, but according to the SSDF ratings, Chess Tiger is 2696.
>>>>>
>>>>>According to the FIDE ratings, there are only 11 players in the world with a
>>>>>higher rating than this.
>>>>>
>>>>>Can this possibly be correct?
>>>>>
>>>>>Graham
>>>>
>>>>As has been said before, Elo ratings between computers are valid within the
>>>>community of computers and have a not very clear, perhaps definitively dark,
>>>>relation to the Elo ratings of human players. In fact, there is no known
>>>>method to determine that relation so far. Only guesses. If monkeys played
>>>>chess, they too would have an Elo rating, but I am sure you would not equate
>>>>the Elo of Sheeta with that of Garry. Sorry for the monkeys.
>>>>Fernando
>>>
>>>If monkeys could play chess, their Elo rating would be very low - so their
>>>ratings could still be compared with Garry's. Monkey Elo would probably be
>>>about 100; Garry's is over 2800.
>>>
>>
>>
>>Not true.  If the monkeys _only_ play other monkeys, some could easily have
>>ratings over 2800.  Of course that would have nothing to do with FIDE
>>ratings...
>>
>>
>>>Following the link on Albert Silver's post to the previous discussion, it
>>>appears that Albert (and others) are saying the same thing - that because you're
>>>not comparing like with like, the computer Elo ratings are not valid.
>>
>>No, that isn't what he said. He said that computer (SSDF) ratings are
>>perfectly valid.  But they have _nothing_ to do with FIDE ratings, other
>>than that both are four-digit, base-ten numbers.  The monkey rating would
>>have nothing to do with either of them, unless the monkeys played in the same pool
>>of players with one of the two groups (FIDE players or SSDF-tested computer
>>programs).
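
To make Bob's point concrete: the Elo expected-score formula only ever looks
at the _difference_ between two ratings, so shifting every rating in a closed
pool by a constant predicts exactly the same results. A minimal sketch in
Python, purely as an illustration:

    def expected_score(r_a, r_b):
        # Elo expected score for player A against player B; it depends
        # only on the rating difference, never on the absolute level
        return 1 / (1 + 10 ** ((r_b - r_a) / 400))

    print(expected_score(2800, 2700))  # about 0.64
    print(expected_score(900, 800))    # about 0.64 - identical

A monkey pool anchored 2000 points lower would make identical predictions
within its own pool, which is why a 2800 rating there would mean nothing on
the FIDE scale.
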
>
>SSDF believe that there is sufficient play between the pool of humans and the
>pool of computers to be able to say that their ratings are valid with respect to
>Swedish (and hence FIDE) human Elo ratings.

When was that written? The last time the SSDF included games played against
humans in their list was in 1993, if I'm not mistaken. That's more than six
years ago.

>
>At the risk of being labelled "naughty", I have copied the following paragraphs
>directly from the FAQ page on the SSDF web site:
>
>*** Start Of Quoted Material***
>
>Q: How are the ratings calculated?
>
>A: SSDF uses its own rating program, written by our member Lars Hjorth, but the
>basic formulas are derived from Arpad Elo's rating system. Our program
>calculates, for each computer, the average rating of its opponents and how many
>points it has scored. Given those two numbers, Professor Elo's formulas produce
>a rating.
>
>However, if all computers are only tested against other computers, all we get is
>a relative rating that is just valid among those computers. Therefore, SSDF has
>played several hundred games between computers and human players in serious
>tournaments and used these results to set a "correct" absolute level for the
>rating list according to Swedish conditions. Different national rating systems
>are not completely in accordance though, and that has to be taken into account
>when reading our list. For instance, US ratings seem to lie approximately 50
>points above the corresponding Swedish ratings (maybe more when below 2000 and
>less on the other side of the scale). For ourselves we obviously use the Swedish
>scale.
>
>We firmly believe that our ratings are correct in the sense that if a computer
>were to play a sufficient number of games against Swedish humans, it would end
>up with a rating close to what it has on our list. Unfortunately, as programs
>get better it becomes increasingly difficult to arrange meaningful games against
>human players. Reassuringly, we've noted that our ratings are fairly consistent
>with the results from the yearly Aegon tournament in Holland.
>
>*** End Of Quoted Material ***
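
For the record, what the FAQ describes is the standard Elo performance
calculation: take the opponents' average rating and the percentage scored,
then read a rating off the expected-score curve. A rough sketch in Python,
using the common closed-form curve rather than Elo's printed tables, and with
made-up numbers chosen only to show the scale:

    import math

    def performance_rating(avg_opponent_rating, score):
        # Invert the Elo expected-score curve
        #   E = 1 / (1 + 10 ** (-(R - R_avg) / 400))
        # which gives R = R_avg + 400 * log10(E / (1 - E))
        if not 0.0 < score < 1.0:
            raise ValueError("score must be strictly between 0 and 1")
        return avg_opponent_rating + 400 * math.log10(score / (1 - score))

    # e.g. scoring 65% against a field averaging 2590 works out to about 2698
    print(round(performance_rating(2590, 0.65)))

As the FAQ itself admits, the hard part is not the formula but anchoring the
scale when almost all the games are computer vs computer.
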
>
>>>
>>>I have yet to be convinced, I'm afraid. Firstly, on their web site, SSDF say
>>>they have done some research to ensure that their rating ranges are reasonably

This was some time ago, plus they also state that "as programs get better it
becomes increasingly difficult to arrange meaningful games against human
players." Furthermore, a program's performance in Aegon is a TPR (tournament
performance rating), not a rating, as too few games were played. I think
Schroeder's GM challenge is revealing the reality of a program's strength
relative to IMs and GMs.
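
To put a number on "too few games": the statistical margin of a six-round
event like Aegon is huge next to the hundreds of games behind an SSDF entry.
A back-of-the-envelope sketch in Python, assuming an even score and roughly
30% draws (both invented assumptions, just to show the order of magnitude):

    import math

    def rating_margin(games, draw_ratio=0.3, z=1.96):
        # Score variance per game at an even score: decisive games
        # contribute 0.25 each, draws contribute nothing
        variance = (1 - draw_ratio) * 0.25
        se_score = math.sqrt(variance / games)
        # Slope of the Elo curve at a 50% score: 1600 / ln(10),
        # roughly 695 rating points per unit of score
        slope = 1600 / math.log(10)
        return z * slope * se_score

    print(round(rating_margin(6)))     # one 6-round Aegon: about +/- 233 points
    print(round(rating_margin(1000)))  # an established SSDF entry: about +/- 18

That is why a good Aegon result is an interesting data point, not a rating.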

                                    Albert Silver

>>>accurate. In the past, for example, they have used the Aegon tournament to check
>>>the validity of their rating ranges.
>>>
>>>Secondly, much of the argument revolved around the idea that computers are prone
>>>to making moves which are weak from the positional perspective - and that
>>>only one such weak move is needed to lose a game against a grandmaster. However, I would
>>>question this for the following reasons:
>>>
>>>* Computers have a remarkably good ability to survive the resulting "crushing"
>>>attacks. Sometimes, when they find an escape, they are able to go on and win the
>>>game.
>>>
>>>* IMs and above tend to divide themselves into "active" players (e.g. Maurice
>>>Ashley) and "positional" players (e.g. Yasser Seirawan, Anatoly Karpov).
>>>Certainly players like Yasser were, in the past, able to beat computers (Yasser
>>>is a previous winner of Aegon). But players like Kasparov (who tends to lose to
>>>computers) must have all (or most) of the positional players' knowledge, because
>>>his Elo rating is so much higher than theirs.
>>
>>
>>Kasparov doesn't tend to lose to computers, excepting one match vs DB.
>
>He lost a G25 match against Genius in London in 1994 - and Kasparov had the
>white pieces! Worse still, the computer, from memory, was only a Pentium 90 -
>not even a Pentium II!
>
>Garry's problem seems to be his style: he is a brilliant tactician, who is not
>usually frightened of seeing a complicated melee in front of him. But Deeper
>Blue (and the other computers) have taught him to be afraid - very afraid - of
>tactics when he's up against a machine.
>
>>>
>>>To organise another Aegon-style tournament would probably cost about
>>>$120,000, and it's entirely possible that, because IBM have basically milked
>>>much of the publicity available for human v computer chess, sponsorship
>>>would be very difficult to obtain. So, for the time being, we're stuck with jumping on every
>>>little scrap of information to try to create a (moving) picture of what the
>>>reality of the ratings is like.
>>>
>>>Graham
>>
>>
>>Ed is providing some reasonable data, although it is taking time to get enough
>>games to draw conclusions.  But at least in another year or so we will have
>>some vague notion about the FIDE rating Rebel might be playing at.
>
>Then SSDF will have to rate his program. At the moment, he's asked for his
>programs not to be rated by SSDF. Ed seems to have a fear of competition. He has
>long since stopped entering computer tournaments. He's not afraid to compete
>against top players because, at the moment, he's the only one doing it.
>
>Graham


