Computer Chess Club Archives


Subject: Re: A theory of ratings drift for the SSDF

Author: Robert Hyatt

Date: 21:16:35 04/11/02



On April 11, 2002 at 15:22:23, Sune Fischer wrote:

>On April 11, 2002 at 14:20:51, Robert Hyatt wrote:
>
>
>>
>>Perhaps I missed something along the way, because I didn't see any "spread"
>>change at all...  although it is certainly possible that the spread can
>>change as stronger players enter the pool, because the spread is a statistical
>>prediction between two players in the pool, but it is an average for _all_ the
>>players.  If you change the pool of players in any way, the average rating can
>>change and the spread can change.  The former more than the latter of course.
>>
>
>And that was my point, that the "spread" won't change at all, only the average.




The "spread" must change some, because the "spread" is an average based on
the entire pool of players.  Change the pool and the spread will alter a
bit, because both players' ratings are influenced by _all_ players in the pool,
in different ways...

It shouldn't change a lot, but it can change some, obviously...  just based
on raw sampling theory..



>>>>But the spread must necessarily stay the same for two players that have a
>>>>constant probability of beating each other.  Although you could add 1000 to
>>>>every pool player's rating and things would continue to work just fine.  In
>>>>fact, you can adjust everyone's rating by a single constant without changing
>>>>a thing.  The statistics still work just fine...
>>>
>>>If the "spread" is the same for any two players, how can the spread then change
>>>at all? :)
>>
>>As I said, the spread should _not_ change.  But with new and stronger players
>>in the pool, they are going to "squash" everyone else down, while they climb,
>>since the points within the pool remain constant more or less...
>
>I don't understand what you are saying, the spread doesn't change but it changes
>anyway..?
>Stronger players entering the pool shouldn't, by the time everything has reached
>equilibrium, affect the "spread" at all.
>

It _must_ affect both the ratings of other players and the "spread" to some
small degree.  A drop of water hitting the pond changes the depth everywhere
as the ripples propagate around and reflect...  Remember that the spread between
two players' ratings is a statistical average of how the two players do against
all other players in the pool.  Adding a new player can change this.  For
example, a player joined our local club back in 1970, and he was rated 300
points above me, yet I won the majority of games against him because our
"styles" gave me an advantage...  That obviously changed the spread between me
and other players in the club, yet I did no better (or worse) against them
after the new player arrived...
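
A minimal Python sketch of that effect, with made-up numbers.  It assumes
the usual logistic Elo curve and fits ratings to an all-play-all score table
by simple iteration; the function names, the four "true" ratings and the 60%
score are illustrative assumptions, not SSDF data or the SSDF's actual
procedure.  Players A, B and C score against each other exactly as the curve
predicts.  Then D joins, nominally 300 points above B, but B takes 60%
against D.  The A-versus-B results are identical in both pools, yet the
fitted gap between A and B is not.

```python
def expected(diff):
    """Elo expected score for a rating difference (logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def model_scores(true_ratings):
    """Score table in which every pairing follows the Elo curve exactly."""
    n = len(true_ratings)
    return [[expected(true_ratings[i] - true_ratings[j]) for j in range(n)]
            for i in range(n)]


def fit_ratings(scores, iterations=5000, step=50.0):
    """Find ratings whose predicted total scores match the actual totals.

    scores[i][j] is the fraction of points player i scored against j.
    Ratings are re-centred on zero each pass, since only differences matter.
    """
    n = len(scores)
    ratings = [0.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            actual = sum(scores[i][j] for j in range(n) if j != i)
            predicted = sum(expected(ratings[i] - ratings[j])
                            for j in range(n) if j != i)
            updated.append(ratings[i] + step * (actual - predicted))
        mean = sum(updated) / n
        ratings = [r - mean for r in updated]
    return ratings


# Hypothetical pool: A, B, C behave exactly as the curve predicts.
pool3 = model_scores([2400, 2300, 2200])

# D joins, nominally 300 above B, but B's style gives B 60% against D.
pool4 = model_scores([2400, 2300, 2200, 2600])
pool4[1][3] = 0.6
pool4[3][1] = 0.4

before = fit_ratings(pool3)
after = fit_ratings(pool4)
gap_before = before[0] - before[1]
gap_after = after[0] - after[1]

print("A vs B score, before and after:", pool3[0][1], pool4[0][1])
print("fitted A-B gap before D joined:", round(gap_before, 1))
print("fitted A-B gap after D joined: ", round(gap_after, 1))
```

A single stylistic mismatch is an extreme case, chosen so the shift is easy to
see; with many opponents and many games the change would normally be small.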




>>
>>>
>>>>
>>>>
>>>>>The difference in elo between players is just related to the win/lose ratio
>>>>>between them, so the spread should stay fixed if the win/lose ratio remains the
>>>>>same.
>>>>>
>>>>>Of course the scale could drift up or down, but since programs perform at a
>>>>>constant level, we do have a tool to correct for that.
>>>>>As I suggested in a different post, one could simply take a group of programs,
>>>>>find their average and make sure that average remains constant.
>>>>>It would be far better with a large group than just one or two programs, much
>>>>>smaller error bars on the "absoluteness" of the scale.
>>>>
>>>>Finding the "average" is statistically invalid.  Elo's formula is _only_
>>>>interested in the difference in rating between two players.  The absolute
>>>>value doesn't mean a thing.
>>>
>>>It doesn't mean a thing _now_. But it could be absolute by the adjustment I
>>>suggested :)
>>
>>It will _never_ mean anything.  The absolute rating is meaningless and is
>>totally arbitrary in the first place.  Just start everyone at 10000 elo and
>>things _still_ work perfectly.  Trying to normalize between two pools is
>>not easy.  Trying to normalize between more than two pools is impossible...
>>And the SSDF represents dozens of pools since new players are added each
>>year...
>
>Meaningless?
>Depends on what you put into the word I guess.
>If we had one single scale by which to compare all players, computers, FIDE,
>USCF, ICC etc., then I wouldn't call that meaningless at all.
>Meaningless is when every pool of players has their own scale.
>
>"Absoluteness" would just be a definition, just as time is a definition, and
>thereby nodes per second and miles per hour and everything else. Is all that
>meaningless?
>We define something we all agree upon and it becomes meaningful.
>
>>>
>>>
>>>> The average of these values also doesn't mean a
>>>>thing...  This is why you should _never_ try to equate SSDF ratings to FIDE
>>>>ratings.  The pools are different.  The values are different.  They mean nothing
>>>>outside their own pool...
>>>>
>>>
>>>They only differ by an added constant. By letting a designated group of
>>>programs get a rating in both pools, we could calibrate the scales to give
>>>the group the same average.
>>>Same spread (if same formula) plus same average -> I'd say same scale.
>>>
>>>What am I missing here, isn't it that simple?
>>
>>No.  This falls under sampling theory...  The "why" is a complex question to
>>answer.  But an average of averages is not useful...
>
>"Sampling theory" you mean the Nyquist frequency or something? :)
>I think it is that simple.
>
>-S.
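
For reference, a short Python sketch of the two mechanical facts the exchange
above turns on, assuming the standard logistic Elo formula.  The pool
contents, program names and the helper anchor_offset are hypothetical.  The
first part shows that adding a constant to every rating leaves the predicted
score untouched.  The second applies the proposed "same average for a
designated group" calibration to two invented pools; whether the
within-pool gaps also line up afterwards is exactly the point in dispute,
and in this invented data they do not.

```python
def expected(ra, rb):
    """Elo expected score of a player rated ra against one rated rb."""
    return 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))


def anchor_offset(pool_a, pool_b, anchors):
    """Constant to add to pool_b so the anchors' average matches pool_a."""
    mean_a = sum(pool_a[name] for name in anchors) / len(anchors)
    mean_b = sum(pool_b[name] for name in anchors) / len(anchors)
    return mean_a - mean_b


# 1. Only the difference matters: shift both ratings by 1000.
base = expected(2500, 2300)
shifted = expected(2500 + 1000, 2300 + 1000)
print("expected score, original vs shifted:", base, shifted)

# 2. Calibrate pool B to pool A through three shared (hypothetical) programs.
pool_a = {"P1": 2500, "P2": 2400, "P3": 2300}
pool_b = {"P1": 2650, "P2": 2560, "P3": 2440}
offset = anchor_offset(pool_a, pool_b, ["P1", "P2", "P3"])
calibrated_b = {name: rating + offset for name, rating in pool_b.items()}

print("offset applied to pool B:", offset)
print("P1-P2 gap in pool A:", pool_a["P1"] - pool_a["P2"])
print("P1-P2 gap in calibrated pool B:",
      calibrated_b["P1"] - calibrated_b["P2"])
```

The offset fixes the average of the anchor group and nothing else; it cannot
make the individual gaps agree if the two pools produced different ones.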




Last modified: Thu, 15 Apr 21 08:11:13 -0700
