Computer Chess Club Archives


Subject: Re: BFF Rating List: 2 Thoughts

Author: Mike Hood

Date: 16:29:28 12/29/03


On December 29, 2003 at 18:19:36, Uri Blass wrote:

>On December 29, 2003 at 17:39:08, Mike Hood wrote:
>
>>On December 29, 2003 at 16:48:36, Uri Blass wrote:
>>
>>>On December 29, 2003 at 16:18:20, Mike Hood wrote:
>>>
>>>>Take a look at the end-of-year Best-for-Fritz rating list, for engines running
>>>>in the Fritz GUI, at
>>>>http://www.beepworld.de/members39/computerschach2/bff-liste.htm
>>>>
>>>>I have two "doubts" about the list:
>>>>
>>>>1) At the bottom of the list is the Chessbase native engine Turing, with a
>>>>rating of 1572. This seems horribly inflated to me. My own "official" rating,
>>>>based on my league games, is 1430. I played a series of games against Turing and
>>>>won 8-0, no draws. My personal estimate for Turing is between 1000 and 1200. If
>>>>you can't trust the Elo values at the bottom of the list, how can you trust the
>>>>values at the top of the list? Maybe the arbitrary start value of 2600 was too
>>>>high. If a start value of 2400, or even 2200, had been used, a more meaningful
>>>>rating list would have been achieved.
>>>
>>>No
>>>
>>>I think that a rating difference in this pool simply does not mean the same
>>>thing as the same rating difference between humans.
>>>
>>>Computers are a different pool, and you cannot compare their ratings with
>>>human ratings.
>>>
>>>Uri
>>
>>Uri, I've heard all the arguments before... "Elo values are only relative values
>>within a pool", "Elo numbers have no absolute worth", etc... That's the theory.
>>But that's not how Elo ratings are used in practice. People say things like
>>"Shredder has an Elo rating of 2750, which is only 80 points less than Garry
>>Kasparov", so Elo values are used as absolute comparisons. My own rating of (I'm
>>ashamed to say it) 1430 is based on games against other players in my home town,
>>but it's been calibrated by games played by a few players in my town against
>>other players in England, and the ratings of the whole of the English players
>>have been calibrated by the games that the top English players have played
>>against chess players from other countries. If Elo ratings are used as a
>>measuring stick, they only have any value if one pool of players is calibrated
>>against another. The ratings of the pool of computer programs have no value
>>unless it is calibrated against the pool of human players. The SSDF claims to do
>>this. (Does it really?) BfF does not do this. So what is the solution? Introduce
>>me as a 1430 player into the BfF pool, and recalculate the list. I know I'm
>>being crass, but when I see an engine like Turing with all its beginner's errors
>>being rated 140 points higher than me I know that something is wrong.
>
>It does not mean that all the engines should go down like Turing. It is
>possible that Turing should go down by 200 Elo while Crafty should go up by 100
>Elo if you want to compare with humans.
>
>I am saying that in different pools the rating difference between the same two
>levels of play may be different.
>
>I believe that the rating difference also depends on who plays against whom:
>you may get a different rating difference if the average gap between two
>opponents is 50 Elo than if the average gap between two opponents is
>100 Elo.
>
>Uri

I think I understand what you're saying, Uri. I have a 100% record in games
against Turing, which would, if taken in isolation, reduce Turing's Elo rating
to -170. (As far as I know, in the Elo rating system, if a pool consists of only
two players, and player A always beats player B, player A has 1600 points more
than player B. Correct?) But there are other players ("engines") in the pool.
Turing might occasionally beat engines that I would rarely draw against, so
Turing's overall rating might be higher than mine. Is this what you mean? If so,
it's a valid point, but I still think that the BfF list needs to be calibrated
by games against humans of different playing strengths.
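
For reference, here is a minimal sketch of the commonly published logistic Elo curve, so the numbers above can be checked. The function names are my own and purely illustrative, and nothing here is taken from how the BfF list is actually computed. Under this curve a 140-point gap (1572 vs. 1430) gives the higher-rated side an expected score of roughly 69%, and a 100% score implies no finite rating difference at all, so any fixed figure for a perfect score has to come from a cap chosen by convention rather than from the formula itself.

```python
import math


def expected_score(diff):
    """Expected score for a player rated `diff` points above the opponent,
    using the standard logistic Elo curve with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def implied_diff(score):
    """Rating difference implied by a score fraction between 0 and 1.
    The value is unbounded when the score is exactly 0 or exactly 1."""
    if score <= 0.0:
        return -math.inf
    if score >= 1.0:
        return math.inf
    return 400.0 * math.log10(score / (1.0 - score))


# Expected score for Turing (1572) against a 1430-rated player.
print(round(expected_score(1572 - 1430), 3))

# An 8-0 result: the formula gives no finite difference.
print(implied_diff(8 / 8))

# Hypothetical comparison: a single draw in eight games already
# brings the implied difference down to a finite value.
print(round(implied_diff(7.5 / 8)))
```

The last line is only a hypothetical comparison; a sample of eight games is far too small to pin down a rating either way.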



Last modified: Thu, 15 Apr 21 08:11:13 -0700