Computer Chess Club Archives


Subject: Re: A theory of ratings drift for the SSDF

Author: Robert Hyatt

Date: 06:48:13 04/11/02



On April 10, 2002 at 17:58:02, Sune Fischer wrote:

>On April 10, 2002 at 16:13:23, Robert Hyatt wrote:
>
>>On April 09, 2002 at 16:04:01, Dann Corbit wrote:
>>
>>
>>one quick note.  You are falling into the same "trap" that 99% of the
>>people here fall into... treating the "rating" as "absolute".  It is not.
>>You should compare the rating of (say) 1996 chessmaster to 1996 genius,
>>then compare the 2002 ratings for both and see if the "spread" has
>>changed much.  If it has, something is wrong.  If it has not, then the
>>Elo system is working perfectly...
>>
>>The absolute rating probably should drop since new and more skilled players
>>are entering the "pool" each year...  But the spread between two programs
>>should not change significantly...
>
>Why would the spread change if they still use the same formula?


Because the _pool_ has changed.  The "new" programs will _necessarily_ be
stronger than the old.  And with Elo, there is a "conservation of rating pool"
built in...  both players get their ratings adjusted by the same amount, but
with "opposite sign".
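
A minimal sketch of that "conservation" property, assuming the textbook logistic Elo formula with a 400-point scale and a K-factor of 32. Neither constant is stated in this thread, and SSDF's actual computation may differ. The names `expected_score` and `update` are illustrative only.

```python
def expected_score(r_a, r_b):
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def update(r_a, r_b, score_a, k=32.0):
    """Return the new (r_a, r_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k=32 is an assumed K-factor, not a value taken from the thread.
    """
    delta = k * (score_a - expected_score(r_a, r_b))
    # A gains delta and B loses exactly delta, so the pool total is unchanged.
    return r_a + delta, r_b - delta


old_a, old_b = 2500.0, 2400.0
new_a, new_b = update(old_a, old_b, score_a=0.0)  # the favorite loses
print(new_a + new_b == old_a + old_b or abs((new_a + new_b) - (old_a + old_b)) < 1e-9)
```

Because each game only moves points from one player to the other, the pool total changes only when players enter or leave the pool.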

But the spread must necessarily stay the same for two players that have a
constant probability of beating each other.  Although you could add 1000 to
every pool player's rating and things would continue to work just fine.  In
fact, you can adjust everyone's rating by a single constant without changing
a thing.  The statistics still work just fine...
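
The shift invariance can be checked directly. This is a sketch assuming the standard logistic expected-score formula with a 400-point scale; the ratings used are made-up numbers, not SSDF data.

```python
def expected_score(r_a, r_b):
    """Expected score of A against B; depends only on the difference r_a - r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


# Hypothetical ratings for two programs, then the same pair shifted by 1000.
base = expected_score(2400.0, 2300.0)
shifted = expected_score(2400.0 + 1000.0, 2300.0 + 1000.0)

# Same 100-point gap, so the predicted result is identical.
print(abs(base - shifted) < 1e-12)
```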



>The difference in elo between players is just related to the win/lose ratio
>between them, so the spread should stay fixed if the win/lose ratio remains the
>same.
>
>Of course the scale could drift up or down, but since programs perform at a
>constant level, we do have a tool to correct for that.
>As I suggested in a different post, one could simply take a group of programs,
>find their average and make sure that average remains constant.
>It would be far better with a large group than just one or two programs, much
>smaller errorbars on the "absoluteness" of the scale.

Finding the "average" is statistically invalid.  Elo's formula is _only_
interested in the difference in rating between two players.  The absolute
value doesn't mean a thing.  The average of these values also doesn't mean a
thing...  This is why you should _never_ try to equate SSDF ratings to FIDE
ratings.  The pools are different.  The values are different.  They mean nothing
outside their own pool...
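
To make the "difference only" point concrete, the expected-score formula can be inverted to recover a rating gap from a score fraction. This is a sketch under the same assumption of the standard logistic Elo model; `rating_gap` is an illustrative name, and the 64% figure is an example value, not a measured result.

```python
import math


def rating_gap(score_fraction):
    """Rating difference implied by a long-run score fraction (0 < fraction < 1)."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# A program scoring 64% against another is roughly 100 points stronger,
# whether the pair is rated 2400/2300 in one pool or 2700/2600 in another.
gap = rating_gap(0.64)
print(round(gap))
```

The formula returns a gap, never an absolute number, which is why a rating has no meaning outside the pool that produced it.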




>
>-S.
>>




Last modified: Thu, 15 Apr 21 08:11:13 -0700
