Computer Chess Club Archives


Subject: Re: Is there a rating inflation?

Author: Chris Carson

Date: 11:52:15 06/03/02


On June 03, 2002 at 14:39:22, Sune Fischer wrote:

>On June 03, 2002 at 13:22:09, Chris Carson wrote:
>
>>On June 03, 2002 at 12:30:53, Sune Fischer wrote:
>>
>>>
>>>yes, if you could do hypothesis testing, of course, but how do you do that?
>>>I hypothesise there has been a slight inflation in rating, but also a slight
>>>increase in average strength of top players. This means we should subtract a bit
>>>from the current ratings of players, but their mean should still be higher.
>>>How do you design a test to confirm this hypothesis?
>>
>>compare means between 1972 GM ratings and 2002 GM ratings.  Since you use
>>slight, I will assume you do not care about significance, although you can
>>determine this with a t-test.  The t-test is one stat test to help confirm your
>>hypothesis (there are others you would actually do for a more detailed analysis).
>
>I don't follow how you want to apply the t-test here.
>It will show you how one rating system correlates to another, but not how the
>underlying strengths correlate, which is what is interesting.
>
>>If you think there has been ratings inflation, then by definition you are
>>comparing the ratings of 2002 with 1972 (or whatever date you choose).  Small
>>changes up or down over time may not be significant.
>
>That is my belief, but I have no way of proving that since comparing two rating
>lists that have very bad correlation doesn't make sense to me.
>Maybe there are statistical tricks that will patch things up, i.e. reestablish
>the correlation. But you need something more than just the elo-scales to do
>that.
>
>>>
>>>You can't very well ask the players from the past to solve a given testset of
>>>positions...
>>>You need some *fixpoint*, some universal scale to match up against, so far we
>>>have been unable to design such a scale.
>>
>>Here is where we disagree, the FIDE ELO scale can be used.  Yes the membership
>>will change, but the rate of change is slow and provides a good measure.  My
>>guess is you disagree.  Again let me encourage you to go learn how to study
>>humans over time (longitudinal studies).
>
>Yes this is where we disagree.
>You assume it's valid, that there is a good correlation and therefore you can do
>tests. But this is an assumption that puts you very close to what you want to
>prove, the proof is in your assumption, not in the statistics AFAIK.
>
>If I give you two random distributions, what do you expect the t-test will show
>you?
>You have
>Elo_1970(strength)=F(T_1970(strength)) and
>Elo_2002(strength)=G(T_2002(strength)), now F(T(..)) and G(T(..)) are known
>distributions, namely the ratinglists.
>But we want to find how the strength evolved in time, how do we do that?
>
>If you treat F, G and T as unknowns (as I do), then you will get nowhere in your
>analysis, you need to make assumptions or approximations, that is unless I'm
>overlooking something ;)
>
>>There are other subjective ways to measure strength.  I like the more objective
>>ELO comparison, if you do not, then don't use it.
>
>The elo scale is fine, but it only works in the here and now.
>
>>>Please tell me how to compare strengths when the elo scale is useless?
>>
>>I disagree that it is useless.  Why would you want to throw it out?
>
>If you have a method that is better, then what do we need it for?
>The elo is already the best we have, by definition.
>
>I think it would be possible to make a better scale than elo, starting now, but
>I'm not sure we could extrapolate it backwards in time.
>
>-S.

Well, we just disagree.
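
For anyone following along, here is a minimal sketch of the comparison being argued about: a Welch t-test on the means of two rating lists. Everything in it is an assumption made for illustration: the numbers are invented (not FIDE data), and the model "published rating = true strength + scale drift" and the helper names welch_t and rating_list are hypothetical. It shows both sides of the disagreement: the test readily detects a shift in mean rating, but a real gain in strength and a pure drift of the scale give the same statistic.

```python
import random
import statistics
from math import sqrt


def welch_t(a, b):
    """Welch's t statistic for the difference in means (b minus a)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(b) - statistics.mean(a)) / sqrt(va / len(a) + vb / len(b))


def rating_list(rng, n, mean_strength, inflation, spread=50.0):
    """Hypothetical model: published rating = true strength + scale drift."""
    return [rng.gauss(mean_strength, spread) + inflation for _ in range(n)]


# All numbers below are invented for illustration; they are not FIDE data.
old_list = rating_list(random.Random(1), 100, mean_strength=2500.0, inflation=0.0)

# Scenario A: players really are 30 points stronger, no inflation.
stronger = rating_list(random.Random(2), 100, mean_strength=2530.0, inflation=0.0)

# Scenario B: same strength as before, but the scale drifted up 30 points.
# Same seed as scenario A, so the underlying random draws are identical.
inflated = rating_list(random.Random(2), 100, mean_strength=2500.0, inflation=30.0)

t_a = welch_t(old_list, stronger)
t_b = welch_t(old_list, inflated)
print(f"t (real improvement): {t_a:.3f}")
print(f"t (pure inflation):   {t_b:.3f}")
```

Both scenarios give essentially the same t value, because the test only ever sees the published ratings. Whether the shift counts as inflation or as improvement depends on what you assume about the scale, which is the point in dispute.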



Last modified: Thu, 15 Apr 21 08:11:13 -0700
