Computer Chess Club Archives


Subject: Re: BFF Rating List: 2 Thoughts

Author: Mike Hood

Date: 14:39:08 12/29/03


On December 29, 2003 at 16:48:36, Uri Blass wrote:

>On December 29, 2003 at 16:18:20, Mike Hood wrote:
>
>>Take a look at the end-of-year Best-for-Fritz rating list, for engines running
>>in the Fritz GUI, at
>>http://www.beepworld.de/members39/computerschach2/bff-liste.htm
>>
>>I have two "doubts" about the list:
>>
>>1) At the bottom of the list is the Chessbase native engine Turing, with a
>>rating of 1572. This seems horribly inflated to me. My own "official" rating,
>>based on my league games, is 1430. I played a series of games against Turing and
>>won 8-0, no draws. My personal estimate for Turing is between 1000 and 1200. If
>>you can't trust the Elo values at the bottom of the list, how can you trust the
>>values at the top of the list? Maybe the arbitrary start value of 2600 was too
>>high. If a start value of 2400, or even 2200, had been used, a more meaningful
>>rating list would have been achieved.
>
>No
>
>I think that the difference in rating is simply different than the difference in
>human rating.
>
>computers are different pool and you cannot use the rating to compare to human
>rating.
>
>Uri

Uri, I've heard all the arguments before: "Elo values are only relative values
within a pool", "Elo numbers have no absolute worth", etc. That's the theory.
But that's not how Elo ratings are used in practice. People say things like
"Shredder has an Elo rating of 2750, which is only 80 points less than Garry
Kasparov", so Elo values are used as absolute comparisons. My own rating of (I'm
ashamed to say it) 1430 is based on games against other players in my home town,
but it's been calibrated by games that a few players in my town have played
against other players in England, and the ratings of English players as a whole
have been calibrated by the games that the top English players have played
against chess players from other countries. If Elo ratings are used as a
measuring stick, they only have value if one pool of players is calibrated
against another. The ratings of the pool of computer programs have no value
unless they are calibrated against the pool of human players. The SSDF claims to
do this. (Does it really?) BfF does not do this. So what is the solution?
Introduce me as a 1430 player into the BfF pool, and recalculate the list. I
know I'm being crass, but when I see an engine like Turing with all its
beginner's errors being rated 140 points higher than me I know that something is
wrong.
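
To put rough numbers on this, here is a minimal Python sketch using the standard logistic Elo expectation formula. The ratings 1430, 1572 and the 2600 start value come from this thread; the 200-point shift and the function and variable names are only illustrative assumptions, and I have no idea how BfF actually computes its list. Treating the expected score as a pure win probability (no draws, independent games) is also a simplification.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (0..1) of player A against player B, logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


MY_RATING = 1430       # human rating quoted in the post
TURING_RATING = 1572   # Turing's rating on the BfF list
START_VALUE = 2600     # start value of the BfF list, as mentioned in the thread

# If both numbers were on the same scale, this is what the list would predict
# for me against Turing in a single game.
per_game = expected_score(MY_RATING, TURING_RATING)

# Simplifying assumption: no draws and independent games, so the chance of an
# 8-0 sweep is the per-game expectation raised to the 8th power.
sweep_probability = per_game ** 8

# Hypothetical shift: lower every rating in the engine pool by 200 points
# (a start value of 2400 instead of 2600).
SHIFT = -200
within_pool_before = expected_score(TURING_RATING, START_VALUE)
within_pool_after = expected_score(TURING_RATING + SHIFT, START_VALUE + SHIFT)

# Across pools the shift does matter: my rating is not part of the engine
# pool, so only Turing's number moves.
cross_pool_after = expected_score(MY_RATING, TURING_RATING + SHIFT)

print(f"Expected score vs Turing at 1572: {per_game:.3f}")
print(f"Chance of an 8-0 sweep at that expectation: {sweep_probability:.6f}")
print(f"Turing vs a 2600 engine, before/after shift: "
      f"{within_pool_before:.4f} / {within_pool_after:.4f}")
print(f"Expected score vs Turing after the shift: {cross_pool_after:.3f}")
```

The within-pool predictions do not change when every engine is shifted by the same amount, which is the "relative values within a pool" argument. The cross-pool prediction does change, and that is the point about calibration: only games between the two pools can say which offset is right.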



Last modified: Thu, 15 Apr 21 08:11:13 -0700
