Computer Chess Club Archives



Subject: Re: BFF Rating List: 2 Thoughts

Author: Uri Blass

Date: 15:19:36 12/29/03



On December 29, 2003 at 17:39:08, Mike Hood wrote:

>On December 29, 2003 at 16:48:36, Uri Blass wrote:
>
>>On December 29, 2003 at 16:18:20, Mike Hood wrote:
>>
>>>Take a look at the end-of-year Best-for-Fritz rating list, for engines running
>>>in the Fritz GUI, at
>>>http://www.beepworld.de/members39/computerschach2/bff-liste.htm
>>>
>>>I have two "doubts" about the list:
>>>
>>>1) At the bottom of the list is the Chessbase native engine Turing, with a
>>>rating of 1572. This seems horribly inflated to me. My own "official" rating,
>>>based on my league games, is 1430. I played a series of games against Turing and
>>>won 8-0, no draws. My personal estimate for Turing is between 1000 and 1200. If
>>>you can't trust the Elo values at the bottom of the list, how can you trust the
>>>values at the top of the list? Maybe the arbitrary start value of 2600 was too
>>>high. If a start value of 2400, or even 2200, had been used, a more meaningful
>>>rating list would have been achieved.
>>
>>No
>>
>>I think that rating differences in this pool simply do not correspond to
>>rating differences among humans.
>>
>>Computers are a different pool, and you cannot compare their ratings with
>>human ratings.
>>
>>Uri
>
>Uri, I've heard all the arguments before... "Elo values are only relative values
>within a pool", "Elo numbers have no absolute worth", etc... That's the theory.
>But that's not how Elo ratings are used in practice. People say things like
>"Shredder has an Elo rating of 2750, which is only 80 points less than Garry
>Kasparov", so Elo values are used as absolute comparisons. My own rating of (I'm
>ashamed to say it) 1430 is based on games against other players in my home town,
>but it's been calibrated by games played by a few players in my town against
>other players in England, and the ratings of the whole of the English players
>have been calibrated by the games that the top English players have played
>against chess players from other countries. If Elo ratings are used as a
>measuring stick, they only have any value if one pool of players is calibrated
>against another. The ratings of the pool of computer programs have no value
>unless they are calibrated against the pool of human players. The SSDF claims to do
>this. (Does it really?) BfF does not do this. So what is the solution? Introduce
>me as a 1430 player into the BfF pool, and recalculate the list. I know I'm
>being crass, but when I see an engine like Turing with all its beginner's errors
>being rated 140 points higher than me I know that something is wrong.

That does not mean that all the engines should go down the way Turing should.
If you want to compare with humans, it is possible that Turing should go down
by 200 Elo while Crafty should go up by 100 Elo.

My point is that in different pools, the rating difference between the same
two levels of play may be different.

I also believe that the rating difference depends on who plays against whom:
you may get a different rating difference when the average gap between two
opponents is 50 Elo than when the average gap is 100 Elo.

Uri
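
For readers who want to put numbers on the figures quoted in this thread, here is a minimal sketch. It is an illustration only and not part of the original post. It assumes the standard logistic Elo expected-score formula with a 400-point scale, and the function name expected_score is hypothetical. It shows what the ratings would predict if both players belonged to one calibrated pool, which is exactly the assumption Uri disputes for human versus engine comparisons.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B.

    Assumption: the standard logistic Elo model with a 400-point scale,
    valid only when both ratings come from the same calibrated pool.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the thread: Mike Hood's league rating and
# Turing's rating on the BfF list.
human_rating = 1430
turing_listed = 1572

per_game = expected_score(human_rating, turing_listed)
print(f"Expected score per game for the human: {per_game:.2f}")

# Expected scores for the stronger side at the two average gaps
# mentioned in the post (50 and 100 Elo).
for gap in (50, 100):
    print(f"Gap of {gap} Elo: {expected_score(1500 + gap, 1500):.2f}")
```

Under these assumptions the listed ratings would give the human roughly a 30% expected score per game, so an 8-0 result is at least a hint that the two numbers do not sit on the same scale.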




Last modified: Thu, 15 Apr 21 08:11:13 -0700
