Computer Chess Club Archives


Subject: Re: A theory of ratings drift for the SSDF

Author: Sune Fischer

Date: 09:09:25 04/11/02


On April 11, 2002 at 09:48:13, Robert Hyatt wrote:

>On April 10, 2002 at 17:58:02, Sune Fischer wrote:
>
>>On April 10, 2002 at 16:13:23, Robert Hyatt wrote:
>>
>>>On April 09, 2002 at 16:04:01, Dann Corbit wrote:
>>>
>>>
>>>one quick note.  You are falling into the same "trap" that 99% of the
>>>people here fall into... treating the "rating" as "absolute".  It is not.
>>>You should compare the rating of (say) 1996 chessmaster to 1996 genius,
>>>then compare the 2002 ratings for both and see if the "spread" has
>>>changed much.  If it has, something is wrong.  If it has not, then the
>>>Elo system is working perfectly...
>>>
>>>The absolute rating probably should drop since new and more skilled players
>>>are entering the "pool" each year...  But the spread between two programs
>>>should not change significantly...
>>
>>Why would the spread change if they still use the same formula?
>
>
>Because the _pool_ has changed.  The "new" programs will _necessarily_ be
>stronger than the old.  And with Elo, there is a "conservation of rating pool"
>built in...  both players get their ratings adjusted by the same amount, but
>with "opposite sign".

I can only agree with you that the old programs would get pushed down by the
newer and stronger programs, but this is about the average dropping for the old
programs, not about the spread.
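
To make the "opposite sign" point concrete, here is a minimal sketch of a
textbook incremental Elo update. The K-factor of 16 and the two starting
ratings are assumptions chosen for illustration; this is not a claim about how
the SSDF actually computes its list.

```python
def expected_score(r_a, r_b):
    """Expected score of A against B on the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def update(r_a, r_b, score_a, k=16.0):
    """Return both new ratings after one game; score_a is 1, 0.5 or 0.

    k=16 is an assumed K-factor, used only for this illustration.
    """
    delta = k * (score_a - expected_score(r_a, r_b))
    # Both players move by the same amount with opposite sign,
    # so the sum of the two ratings is conserved.
    return r_a + delta, r_b - delta


# Hypothetical example: an old program loses one game to a new program
# that entered the pool with the same rating.
old, new = 2500.0, 2500.0
old_after, new_after = update(old, new, score_a=0.0)

pool_sum_before = old + new
pool_sum_after = old_after + new_after
print(old_after, new_after, pool_sum_after - pool_sum_before)
```

The old program's number goes down and the newcomer's goes up by the same
amount, so the pool total does not move. Nothing in this update refers to the
gap between two other programs.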

>But the spread must necessarily stay the same for two players that have a
>constant probability of beating each other.  Although you could add 1000 to
>every pool player's rating and things would continue to work just fine.  In
>fact, you can adjust everyone's rating by a single constant without changing
>a thing.  The statistics still work just fine...

If the "spread" is the same for any two players, how can the spread then change
at all? :)
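
A small sketch of the two properties under discussion, assuming the usual
logistic Elo curve with a 400-point scale: the expected score depends only on
the rating difference, and the difference can be recovered from the expected
score. The ratings used are made-up numbers.

```python
import math


def expected_score(r_a, r_b):
    """Expected score of A against B; depends only on r_a - r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def rating_gap(score):
    """Rating difference implied by an expected score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


# Hypothetical ratings: the same pair before and after adding 1000 to everyone.
base = expected_score(2500.0, 2400.0)
shifted = expected_score(3500.0, 3400.0)

# Inverting the curve gives back the 100-point gap.
gap = rating_gap(base)
print(base, shifted, gap)
```

Adding a constant to every rating leaves `base` and `shifted` identical, and a
fixed expected score always maps back to the same gap.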

>
>
>>The difference in Elo between players is just related to the win/loss ratio
>>between them, so the spread should stay fixed if the win/loss ratio remains the
>>same.
>>
>>Of course the scale could drift up or down, but since programs perform at a
>>constant level, we do have a tool to correct for that.
>>As I suggested in a different post, one could simply take a group of programs,
>>find their average and make sure that average remains constant.
>>It would be far better with a large group than just one or two programs, giving
>>much smaller error bars on the "absoluteness" of the scale.
>
>Finding the "average" is statistically invalid.  Elo's formula is _only_
>interested in the difference in rating between two players.  The absolute
>value doesn't mean a thing.

It doesn't mean a thing _now_. But it could be absolute by the adjustment I
suggested :)


> The average of these values also doesn't mean a
>thing...  This is why you should _never_ try to equate SSDF ratings to FIDE
>ratings.  The pools are different.  The values are different.  They mean nothing
>outside their own pool...
>

They only differ by an added constant. By letting a designated group of programs
get a rating in both pools, we could calibrate the scales to give the group the
same average.
Same spread (if same formula) plus same average -> I'd say same scale.
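
As a rough sketch of that calibration: the program names and ratings below are
hypothetical, and the whole thing rests on the assumption stated above, namely
that the two pools differ only by an added constant.

```python
def calibration_offset(anchors_a, anchors_b):
    """Constant to add to pool B ratings so that the anchor group's
    average in pool B matches its average in pool A."""
    common = sorted(anchors_a.keys() & anchors_b.keys())
    if not common:
        raise ValueError("no program is rated in both pools")
    mean_a = sum(anchors_a[name] for name in common) / len(common)
    mean_b = sum(anchors_b[name] for name in common) / len(common)
    return mean_a - mean_b


def recalibrate(pool, offset):
    """Shift every rating in a pool by the same constant."""
    return {name: rating + offset for name, rating in pool.items()}


# Hypothetical anchor group rated in both pools (invented numbers).
pool_a = {"prog_x": 2500.0, "prog_y": 2400.0, "prog_z": 2300.0}
pool_b = {"prog_x": 2620.0, "prog_y": 2510.0, "prog_z": 2430.0}

offset = calibration_offset(pool_a, pool_b)
pool_b_on_a_scale = recalibrate(pool_b, offset)
print(offset, pool_b_on_a_scale)
```

The shift changes no gap inside pool B; it only moves the anchor group's
average onto pool A's. Whether the two pools really do differ by nothing more
than a constant is the open question here, and the sketch does not settle it.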

What am I missing here, isn't it that simple?

-S.

>>-S.
>>>



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.