Computer Chess Club Archives



Subject: Re: Chess Tiger - Is It Really 2696 ELO?

Author: Robert Hyatt

Date: 07:00:07 12/22/99



On December 22, 1999 at 08:16:55, Graham Laight wrote:

>On December 21, 1999 at 23:16:18, Robert Hyatt wrote:
>
>>On December 21, 1999 at 21:23:57, Graham Laight wrote:
>>
>>>On December 21, 1999 at 18:10:05, Fernando Villegas wrote:
>>>
>>>>On December 21, 1999 at 13:15:11, Graham Laight wrote:
>>>>
>>>>>I apologise for bringing up a subject which has undoubtedly already been
>>>>>discussed, but according to the SSDF ratings, Chess Tiger is 2696.
>>>>>
>>>>>According to the FIDE ratings, there are only 11 players in the world with a
>>>>>higher rating than this.
>>>>>
>>>>>Can this possibly be correct?
>>>>>
>>>>>Graham
>>>>
>>>>As has been said before, Elo ratings among computers are valid within the
>>>>community of computers, and have an unclear, perhaps permanently obscure,
>>>>relation to the Elo ratings of human players. In fact, there is no known
>>>>method to determine that relation, until now. Only guesses. If monkeys played
>>>>chess, they too would have an Elo rating, but I am sure you would not equate
>>>>the Elo of Sheeta with that of Gary. Sorry for the monkeys.
>>>>Fernando
>>>
>>>If monkeys could play chess, their Elo rating would be very low - so their
>>>ratings would still be comparable with Gary's. Monkey Elo would probably be
>>>about 100; Gary's is over 2800.
>>>
>>
>>
>>Not true.  If the monkeys _only_ play other monkeys, some could easily have
>>ratings over 2800.  Of course that would have nothing to do with FIDE
>>ratings...
>>
>>
>>>Following the link on Albert Silver's post to the previous discussion, it
>>>appears that Albert (and others) are saying the same thing - that because you're
>>>not comparing like with like, the computer Elo ratings are not valid.
>>
>>No, that isn't what he said. He said that computer (SSDF) ratings are
>>perfectly valid.  But they have _nothing_ to do with FIDE ratings, other
>>than that both are four-digit, base-ten numbers.  The monkey rating would
>>have nothing to do with either of these either, unless the monkeys played in
>>the same pool of players as one of the two groups (FIDE players or
>>SSDF-tested computer programs).
>
>SSDF believe that there is sufficient play between the pool of humans and the
>pool of computers to be able to say that their ratings are valid with respect to
>Swedish (and hence FIDE) human Elo ratings.
>
>At the risk of being labelled "naughty", I have copied the following paragraphs
>directly from the FAQ page on the SSDF web site:
>
>*** Start Of Quoted Material***
>
>Q: How are the ratings calculated?
>
>A: SSDF uses its own rating program, written by our member Lars Hjorth, but the
>basic formulas are derived from Arpad Elo's rating system. Our program
>calculates, for each computer, the average rating of its opponents and how many
>points it has scored. Given those two numbers, professor Elo's formulas produce
>a rating.
>
>However, if all computers are only tested against other computers, all we get is
>a relative rating that is just valid among those computers. Therefore, SSDF has
>played several hundred games between computers and human players in serious
>tournaments and used these results to set a "correct" absolute level for the
>rating list according to Swedish conditions. Different national rating systems
>are not completely in accordance though, and that has to be taken into account
>when reading our list. For instance, US ratings seems to lie approximately 50
>points above the corresponding Swedish ratings (maybe more when below 2000 and
>less on the other side of the scale). For ourselves we obviously use the Swedish
>scale.
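The calculation the FAQ describes - feed the average opponent rating and the
score percentage into Elo's formulas - can be sketched as a performance-rating
computation. A minimal sketch, using the common logistic inversion of the Elo
expectancy curve; the exact tables Lars Hjorth's program uses are not given
here, so treat the details as illustrative only:

```python
import math

def performance_rating(avg_opponent_rating, points, games):
    """Elo-style performance rating from score vs. average opposition."""
    p = points / games                    # score fraction, 0..1
    p = min(max(p, 0.01), 0.99)           # clamp: 100%/0% scores have no finite rating
    return avg_opponent_rating + 400 * math.log10(p / (1 - p))

# Example: 65/100 against opposition averaging 2550 performs near 2658
print(round(performance_rating(2550, 65, 100)))
```

Note that a 50% score returns exactly the opponents' average: the formula
anchors nothing to any outside scale, which is why the whole list can drift
together unless it is calibrated against an external pool.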



That ought to be removed from their FAQ.  That calibration was done in the
1980's, and has _not_ been done since.  Do you think there are still any
vestiges of those old human ratings left, since for the past 10+ years they
have _only_ included computer vs computer games in their ratings?  For this to
be valid as a comparison between SSDF and FIDE ratings, they would have to
_regularly_ play several of the programs in FIDE events and rate them in their
'pool' using the results of those games + the comp/comp games.  They don't do
this.
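The "relative rating" problem can be made concrete: Elo expected scores depend
only on rating *differences*, so adding a constant offset to every member of a
closed pool changes nothing observable inside that pool. A small sketch (the
program names and numbers are hypothetical):

```python
def expected_score(r_a, r_b):
    """Elo expected score of player A against player B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

pool = {"ProgA": 2700, "ProgB": 2600}
# Shift the whole pool by 300 points: every pairwise expectancy is unchanged,
# so no amount of comp-vs-comp play can detect (or correct) the offset.
shifted = {name: r + 300 for name, r in pool.items()}

print(expected_score(pool["ProgA"], pool["ProgB"]))
print(expected_score(shifted["ProgA"], shifted["ProgB"]))  # identical
```

Only cross-pool games - programs playing FIDE-rated humans - can pin the
offset down, which is exactly the anchoring step that has not been redone.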





>
>We firmly believe that our ratings are correct in the sense that if a computer
>were to play a sufficient number of games against Swedish humans, it would end
>up with a rating close to what it has on our list. Unfortunately, as programs
>get better it becomes increasingly difficult to arrange meaningful games against
>human players. Reassuringly, we've noted that our ratings are fairly consistent
>with the results from the yearly Aegon tournament in Holland.


Baloney nowadays.  No program would consistently play at near 2700 at
Aegon.




>
>*** End Of Quoted Material ***
>
>>>
>>>I have yet to be convinced, I'm afraid. Firstly, on their web site, SSDF say
>>>they have done some research to ensure that their rating ranges are reasonably
>>>accurate. In the past, for example, they have used the Aegon tournament to check
>>>the validity of their rating ranges.
>>>
>>>Secondly, much of the argument revolved around the idea that computers are prone
>>>to making moves which are weak from the positional perspective - and that only 1
>>>such weak move is needed to lose a game with a grandmaster. However, I would
>>>question this for the following reasons:
>>>
>>>* Computers have a remarkably good ability to survive the resulting "crushing"
>>>attacks. Sometimes, when they find an escape, they are able to go on and win
>>>the game.
>>>
>>>* IMs and above tend to divide themselves into "active" players (e.g. Maurice
>>>Ashley) and "positional" players (e.g. Yasser Seirawan, Anatoly Karpov).
>>>Certainly players like Yasser were, in the past, able to beat computers (Yasser
>>>is a previous winner of Aegon). But players like Kasparov (who tends to lose to
>>>computers) must have all (or most) of the positional players' knowledge, because
>>>his Elo rating is so much higher than theirs.
>>
>>
>>Kasparov doesn't tend to lose to computers, excepting one match vs DB.
>
>He lost a G25 match against Genius in London in 1994 - and Kasparov had the
>white pieces! Worse still, the computer, from memory, was only a Pentium 90 -
>not even a Pentium II!
>
>Gary's problem seems to be his style: he is a brilliant tactician who is not
>usually frightened of seeing a complicated melee in front of him. But Deeper
>Blue (and the other computers) have taught him to be afraid - very afraid - of
>tactics when he's up against a machine.
>
>>>
>>>To organise another Aegon-style tournament would probably cost about $120,000,
>>>and it's entirely possible that, because IBM have basically milked much of the
>>>publicity available for human v computer chess, sponsorship would be very
>>>difficult to obtain. So, for the time being, we're stuck with jumping on every
>>>little scrap of information to try to create a (moving) picture of what the
>>>reality of the ratings is like.
>>>
>>>Graham
>>
>>
>>Ed is providing some reasonable data, although it is taking time to get enough
>>games to draw conclusions.  But at least in another year or so we will have
>>some vague notion about the FIDE rating Rebel might be playing at.
>
>Then SSDF will have to rate his program. At the moment, he's asked for his
>programs not to be rated by SSDF. Ed seems to have a fear of competition. He has
>long since stopped entering computer tournaments. He's not afraid to compete
>against top players because, at the moment, he's the only one doing it.
>
>Graham


I don't believe he has a fear of competing.  I think he has a fear of abuse,
as it seems that some autoplayer issues were surfacing that indicated a bit of
foul play was possible...




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.