Computer Chess Club Archives


Subject: Re: Rating in ICC is meaningless and here is an example

Author: Robert Hyatt

Date: 17:47:46 01/14/03


On January 14, 2003 at 19:47:21, Uri Blass wrote:

>On January 14, 2003 at 19:12:21, Miguel A. Ballicora wrote:
>
>>On January 14, 2003 at 18:25:39, Robert Hyatt wrote:
>>
>>>On January 14, 2003 at 18:09:35, Miguel A. Ballicora wrote:
>>>
>>>>On January 14, 2003 at 16:28:03, Robert Hyatt wrote:
>>>>
>>>>>On January 14, 2003 at 15:56:04, Miguel A. Ballicora wrote:
>>>>>
>>>>>>On January 14, 2003 at 14:53:39, Robert Hyatt wrote:
>>>>>>
>>>>>>>On January 14, 2003 at 12:35:02, Miguel A. Ballicora wrote:
>>>>>>>
>>>>>>>>On January 14, 2003 at 10:55:38, Andrew Williams wrote:
>>>>>>>>
>>>>>>>>>On January 14, 2003 at 10:43:20, Uri Blass wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>{Game 494 (MoveiXX vs. ACCIDENTE) ACCIDENTE resigns} 1-0
>>>>>>>>>>Blitz rating adjustment: 2635 --> 2602
>>>>>>>>>>
>>>>>>>>>>Movei won a game and lost rating.
>>>>>>>>>>
>>>>>>>>>>Uri
>>>>>>>>>
>>>>>>>>>It seems a bit strange when moveixx has played a total of *thirteen* games to
>>>>>>>>>declare that the rating system is "meaningless". What you have observed only
>>>>>>>>>occurs in the first few games. I've forgotten now how many games it requires
>>>>>>>>>before it settles down.
>>>>>>>>
>>>>>>>>Uri is pointing out a flaw.
>>>>>>>>The fact that it happens while one is provisional does not make it less serious.
>>>>>>>>After 20 games you could end up with a very wrong rating. Suppose that you
>>>>>>>>played only 1000-1500 elo players and beat all of them. Later, you will take lots of
>>>>>>>>points from the rating pool, causing deflation. Overall, I think that introduces
>>>>>>>>a lot of noise. However, considering all the mess regarding these ratings, this
>>>>>>>>point is not one of the worst.
>>>>>>>>
>>>>>>>>Miguel
>>>>>>>
>>>>>>>This is _not_ a "flaw".
>>>>>>
>>>>>>It is not a flaw, it is a major screw-up considering how uneven the
>>>>>>population of players in ICC is.
>>>>>
>>>>>It isn't a flaw, nor a major screw-up.  How about giving some good algorithm
>>>>>to develop an approximate rating for a new player?
>>>>
>>>>There are many options to do it. For instance, you do not need to approximate.
>>>>It is quite silly in the era of the computers to use paper and pencil
>>>>approximations that Dr. Elo _had_ to do decades ago.
>>>
>>>I'm waiting on a real suggestion.  You play one game and beat a 1200 player.
>>
>>Uri gave you one, I gave you one. I elaborate more below.
>>
>>>What is your rating?  You play another game and lose to a 1200 player.  What
>>>is your rating?
>>>
>>>You _must_ start somewhere...  And the only place you can start is by using
>>>the ratings of the two players you have played, along with the results, to start
>>>a first approximation to your rating.
>>
>>
>>You do not understand. I am not talking about an approximation to the rating of
>>the player. I am saying that an approximation of the formula used by Elo (using
>>averages of ratings) should not be used, and the real formulas should be used instead.
>>
>>
>>>>>BTW you do know that just because a new player's rating fluctuates wildly,
>>>>>his opponents do _not_ get all those points added or subtracted from _their_
>>>>>ratings?
>>>>
>>>>>>It is based on an approximation. Every approximation works between certain
>>>>>>boundaries.
>>>>>>
>>>>>>>For the first 20 games, you use a "provisional rating formula" and you can lose
>>>>>>>points by winning if you play a much lower-rated player.  USCF does this.
>>>>>>>_everybody_ does it as you have to get an initial rating from somewhere.
>>>>>>
>>>>>>USCF does that, and that is one of the reasons why initial ratings in many cases are
>>>>>>horrible and there were many cases of cheating because of this. For instance,
>>>>>>kids that play only against 2000 rated people and their initial rating is 1600.
>>>>>
>>>>>What else would you propose?  There is no solution.  Criticizing the _only_
>>>>>solution makes little sense IMHO.
>>>>
>>>>What makes you think that this is the only solution?
>>>>There are many rating systems around!
>>>
>>>I'm waiting for a suggestion for the _initial rating_.  All rating systems I
>>>know of use a TPR-type approximation to seed initial rating values.
>>
>>>>Even the simple solution proposed by Uri deserves consideration: not to take
>>>>into account games where the elo of A is more than 400 points higher than that of B.
>>>>
>>>>The one I could propose is you take the pool of players that you played and
>>>>calculate what is the Elo that would give you the same amount of points that you
>>>>obtained, doing the calculation "game by game", not by a crude average. For
>>>>that, you need to iterate and that is the reason why most probably was never
>>>>used at the beginning.
>>>>
>>>
>>>Er... that is what the TPR approximates, in fact.  Which is _the_ point here.
>>
>>No, it is not. Classically the TPR is calculated from the average of your
>>opposition, which is an approximation. (Still that is better than the crude
>>average of your opposition, though). What I am saying is that you calculate it
>>game by game. The problem is that you have to do it iteratively. Today, that is
>>not a problem.
>>
>>What is your rating if you play
>>1) 2600 draw
>>2) 2600 draw
>>3) 2600 draw
>>4) 2000 win
>>5) 2000 win
>>6) 2000 win
>>
>>6 games, 4.5 points. Common sense indicates that your rating should be
>>slightly above 2600. If you calculate it by the USCF method it is 2500.
>>Horrible.
>>
>>Now do this:
>>Ask, what is the expectancy for a 2300 player?
>>2300-2000 => +300 --> 0.85 (IIRC)
>>2300-2600 => -300 --> 0.15
>>
>>1) 0.15
>>2) 0.15
>>3) 0.15
>>4) 0.85
>>5) 0.85
>>6) 0.85
>>   3.00 = total.
>>
>>So, the expectancy will be 3.0/6.0 points that means that your rating is higher
>>than 2300 since you got 4.5.
>>Ask the same question for a 2400 player. Nope, it should be higher. What about
>>2700? Nope, too high. Iterate until you find the answer. It will be slightly
>>higher than 2600, as it should be.
>>
>>That means doing it game by game, not from the average.
>>
>>>To do it any other way distorts the statistical significance.
>>>
>>>>Lots of things can be done.
>>>>
>>>>>>That is one of the reasons why when I started to play in US, my initial rating
>>>>>>was way below the one that I should have had (personally I do not give a damn)
>>>>>>because I played tournaments in the area against nobody. That is also the reason
>>>>>>why Anatoly Karpov was rated (maybe still is) 2500 in USA. Ridiculous.
>>>>>
>>>>>You do realize that your rating reflects your results in a rating pool?  Once
>>>>>again you are using a local rating to compare with ratings from other pools.
>>>>>It is statistically invalid to do this.
>>>>
>>>>You are assuming, that I compared my elo somewhere else with the elo that I got
>>>>in USCF and I was not happy. No, I compared the elo that I got with the elo of
>>>>other people who played worse than me here in US. It took me a _long_ time until
>>>>that was reversed and still my elo did not reach a balance. Partially, because
>>>>it is difficult to increase your elo fast when you play opposition that is weaker
>>>>than you.
>>>
>>>That is what the statistics involved produce.  And it is a _desired_ effect, in
>>>fact.  Otherwise you could beat nobodies and produce a huge rating.
>>>
>>>
>>>
>>>>Besides, if I did the comparison USCF ratings are slightly overrated compared to
>>>>FIDE so even if I did, I was not wrong. I was really tired of listening to my
>>>>opponents saying: Are you really 2050?
>>>>
>>>>Karpov 2596? Come on!!! He played the US Amateur and beat a couple of players
>>>>with a very low rating and that was the result. Yes, 6 games, but he won all of
>>>>them.
>>>>http://www.64.com/uscf/ratings/12730227
>>>
>>>So?  You can't re-write the statistics to produce a result you want for a
>>>special case...  I believe that USCF uses a FIDE rating as the initial rating if the
>>
>>No, it is not a special case. The case I am pointing out shows the flaw.
>>Karpov is not 2596 in '98. What did he do wrong? He agreed to play against a
>>couple of low-rated people who screwed up the average.
>>Bruce Moreland pointed out the flaw in another message. You can really inflate your
>>rating if you play against strong people at the beginning, or deflate yours if
>>you play only very weak ones. It is enough to include enough weak players to throw
>>your average to the bottom.
>>
>>Miguel
>
>Of course you are right and there may be even improvements.
>
>The system does not use result of other players to change the rating.
>
>Suppose that your opponent wins all his games and improves his rating after a game
>against you.
>I think that it may be logical to use the information to increase your rating
>because it is logical to assume that the rating for your opponent was wrong but
>the system does not do it.
>
>I do not suggest exactly how to do it and it seems that the problem does not
>interest the ICC.


ICC is not interested.  FIDE is not interested.  USCF is not interested. In
fact, _no_ chess federation I know of does the initial rating differently than
what is done today.

You are taking a rating as an absolute value.  It is _not_.  It is an
estimate of how you would do against the group (pool) of players you compete
in.  The current TPR approach is _exact_ in that regard.  Even if it has
nothing to do with how you would do against other players.  I have _yet_ to
see anyone suggest an alternative.  Just complaints about how it is done now.
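
The six-game example quoted above (three draws against 2600, three wins against 2000) can be checked numerically. The following is only a sketch under stated assumptions: it uses the standard logistic Elo expectancy, which matches the 0.85 quoted for a 300-point edge, and it takes the "USCF method" to be the classic averaged formula (mean opponent rating plus 400 * (wins - losses) / games), which reproduces the 2500 figure in the quote. The function names are hypothetical, and nothing here is claimed to be what ICC, USCF, or FIDE actually run.

```python
def expected_score(rating, opponent):
    """Logistic Elo expectancy for a single game (assumed form)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def averaged_estimate(results):
    """Averaged formula: mean opponent rating + 400 * (wins - losses) / games."""
    n = len(results)
    mean_opp = sum(opp for opp, _ in results) / n
    wins = sum(1 for _, score in results if score == 1.0)
    losses = sum(1 for _, score in results if score == 0.0)
    return mean_opp + 400.0 * (wins - losses) / n


def game_by_game_estimate(results, lo=0.0, hi=4000.0, tol=0.01):
    """Bisect for the rating whose summed per-game expectancy equals the score.

    The summed expectancy rises with the rating, so bisection converges.
    A perfect (or zero) score has no finite solution and just runs to a bound.
    """
    actual = sum(score for _, score in results)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, opp) for opp, _ in results)
        if expected < actual:
            lo = mid   # this rating would score too few points: go higher
        else:
            hi = mid   # this rating would score too many points: go lower
    return (lo + hi) / 2.0


# (opponent rating, score) pairs from the quoted example
games = [(2600, 0.5)] * 3 + [(2000, 1.0)] * 3

print(averaged_estimate(games))             # the averaged figure
print(round(game_by_game_estimate(games)))  # the game-by-game figure
```

Under these assumptions the averaged formula gives 2500, while the game-by-game search lands a little above 2600, which is the gap the quoted example is pointing at.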

Without suggestions on a better way, complaints are not very useful...

"I don't like that, fix it" is _not_ going to produce changes.



>
>
>If ICC cares about creating a better rating system,
>they can give $10000 to the people who find the best rating system.

I believe Elo did that a _long_ time back.  It has certainly stood the
"test of time".


>
>It seems that they do not care so they will not do it.
>
>I also have a definition by which we can compare different rating systems.
>A rating system should give the expected result in every game.
>It is easy to use the sum of squares to find the practical error and the rating
>system that gives the smallest error is the most logical rating to choose.
>
>Uri

Can you spell "Elo"???
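
The sum-of-squares comparison proposed in the quote can be sketched as follows. This is a minimal illustration, assuming the logistic Elo expectancy as the prediction and using made-up players, ratings, and results; the function names and data are hypothetical and only show how two rating assignments would be scored against the same games.

```python
def expected_score(rating, opponent):
    """Logistic Elo expectancy for a single game (assumed form)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def squared_error(games, ratings):
    """Sum of (actual score - predicted score)^2 over all games.

    games:   list of (player, opponent, score) with score in {0, 0.5, 1}
    ratings: dict mapping player name to rating
    """
    return sum(
        (score - expected_score(ratings[player], ratings[opponent])) ** 2
        for player, opponent, score in games
    )


# Hypothetical results: A beats B twice and draws once.
games = [("A", "B", 1.0), ("A", "B", 1.0), ("A", "B", 0.5)]

# Two hypothetical rating assignments to compare on the same games.
ratings_x = {"A": 2000, "B": 2000}
ratings_y = {"A": 2200, "B": 2000}

print(squared_error(games, ratings_x))
print(squared_error(games, ratings_y))
```

On this toy data the assignment that rates A above B has the smaller error. A fair comparison of rating systems would score predictions on games the ratings were not fitted to.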





Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.