Computer Chess Club Archives

Subject: Re: Rating in ICC is meaningless and here is an example

Author: Uri Blass

Date: 16:47:21 01/14/03

On January 14, 2003 at 19:12:21, Miguel A. Ballicora wrote:

>On January 14, 2003 at 18:25:39, Robert Hyatt wrote:
>
>>On January 14, 2003 at 18:09:35, Miguel A. Ballicora wrote:
>>
>>>On January 14, 2003 at 16:28:03, Robert Hyatt wrote:
>>>
>>>>On January 14, 2003 at 15:56:04, Miguel A. Ballicora wrote:
>>>>
>>>>>On January 14, 2003 at 14:53:39, Robert Hyatt wrote:
>>>>>
>>>>>>On January 14, 2003 at 12:35:02, Miguel A. Ballicora wrote:
>>>>>>
>>>>>>>On January 14, 2003 at 10:55:38, Andrew Williams wrote:
>>>>>>>
>>>>>>>>On January 14, 2003 at 10:43:20, Uri Blass wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>{Game 494 (MoveiXX vs. ACCIDENTE) ACCIDENTE resigns} 1-0
>>>>>>>>>Blitz rating adjustment: 2635 --> 2602
>>>>>>>>>
>>>>>>>>>Movei won a game and lost rating.
>>>>>>>>>
>>>>>>>>>Uri
>>>>>>>>
>>>>>>>>It seems a bit strange when moveixx has played a total of *thirteen* games to
>>>>>>>>declare that the rating system is "meaningless". What you have observed only
>>>>>>>>occurs in the first few games. I've forgotten now how many games it requires
>>>>>>>>before it settles down.
>>>>>>>
>>>>>>>Uri is pointing out a flaw.
>>>>>>>The fact that it happens only while one is provisional does not make it less serious.
>>>>>>>After 20 games you could end up with a very wrong rating. Suppose that you
>>>>>>>played only 1000-1500 Elo players and beat all of them. Later, you will take lots of
>>>>>>>points from the rating pool, causing deflation. Overall, I think that introduces
>>>>>>>a lot of noise. However, considering all the mess regarding these ratings, this
>>>>>>>point is not one of the worst.
>>>>>>>
>>>>>>>Miguel
>>>>>>
>>>>>>This is _not_ a "flaw".
>>>>>
>>>>>It is not a flaw, it is a major screw-up considering how uneven the
>>>>>population of players on ICC is.
>>>>
>>>>It isn't a flaw, nor a major screw-up.  How about giving some good algorithm
>>>>to develop an approximate rating for a new player?
>>>
>>>There are many options to do it. For instance, you do not need to approximate.
>>>It is quite silly in the era of the computers to use paper and pencil
>>>approximations that Dr. Elo _had_ to do decades ago.
>>
>>I'm waiting on a real suggestion.  You play one game and beat a 1200 player.
>
>Uri gave you one, I gave you one. I elaborate more below.
>
>>What is your rating?  You play another game and lose to a 1200 player.  What
>>is your rating?
>>
>>You _must_ start somewhere...  And the only place you can start is by using
>>the ratings of the two players you have played, along with the results, to start
>>a first approximation to your rating.
>
>
>You do not understand. I am not talking about an approximation to the rating of
>the player. I am saying that an approximation of the formula used by Elo (using
>averages of ratings) should not be used, and the real formulas should be used instead.
>
>
>>>>BTW you do know that just because a new player's rating fluctuates wildly,
>>>>his opponents do _not_ get all those points added or subtracted from _their_
>>>>ratings?
>>>
>>>>>It is based on an approximation. Every approximation works between certain
>>>>>boundaries.
>>>>>
>>>>>>For the first 20 games, you use a "provisional rating formula" and you can lose
>>>>>>points by winning if you play a much lower-rated player.  USCF does this.
>>>>>>_everybody_ does it as you have to get an initial rating from somewhere.
>>>>>
>>>>>USCF does that, and that is one of the reasons why initial ratings in many cases are
>>>>>horrible and there were many cases of cheating because of this. For instance,
>>>>>kids who play only against 2000-rated people and get an initial rating of 1600.
>>>>
>>>>What else would you propose?  There is no solution.  Criticizing the _only_
>>>>solution makes little sense IMHO.
>>>
>>>What makes you think that this is the only solution?
>>>There are many rating systems around!
>>
>>I'm waiting for a suggestion for the _initial rating_.  All rating systems I
>>know of use a TPR-type approximation to seed initial rating values.
>
>>>Even the simple solution proposed by Uri deserves consideration: not to take
>>>into account games where A is rated more than 400 points above B.
>>>
>>>The one I could propose is that you take the pool of players that you played and
>>>calculate the Elo that would give you the same number of points that you
>>>obtained, doing the calculation "game by game", not by a crude average. For
>>>that, you need to iterate, and that is most probably the reason why it was never
>>>used at the beginning.
>>>
>>
>>Er... that is what the TPR approximates, in fact.  Which is _the_ point here.
>
>No, it is not. Classically the TPR is calculated from the average of your
>opposition, which is an approximation. (Still that is better than the crude
>average of your opposition, though). What I am saying is that you calculate it
>game by game. The problem is that you have to do it iteratively. Today, that is
>not a problem.
>
>What is your rating if you play
>1) 2600 draw
>2) 2600 draw
>3) 2600 draw
>4) 2000 win
>5) 2000 win
>6) 2000 win
>
>6 games, 4.5 points. Common sense indicates that your rating should be
>slightly above 2600. If you calculate it by the USCF method it is 2500.
>Horrible.
>
>Now do this:
>Ask, what is the expectancy for a 2300 player?
>2300-2000 => +300 --> 0.85 (IIRC)
>2300-2600 => -300 --> 0.15
>
>1) 0.15
>2) 0.15
>3) 0.15
>4) 0.85
>5) 0.85
>6) 0.85
>   3.00 = total.
>
>So, the expectancy will be 3.0/6.0 points, which means that your rating is higher
>than 2300 since you got 4.5.
>Ask the same question for a 2400 player. Nope, it should be higher. What about
>2700? Nope, too high. Iterate until you find the answer. It will be slightly
>higher than 2600, as it should be.
>
>That means doing it game by game, not from the average.
>
>>To do it any other way distorts the statistical significance.
>>
>>>Lots of things can be done.
>>>
>>>>>That is one of the reasons why when I started to play in US, my initial rating
>>>>>was way below the one that I should have had (personally I do not give a damn)
>>>>>because I played tournaments in the area against nobody. That is also the reason
>>>>>why Anatoly Karpov was rated (maybe still is) 2500 in USA. Ridiculous.
>>>>
>>>>You do realize that your rating reflects your results in a rating pool?  Once
>>>>again you are using a local rating to compare with ratings from other pools.  It is
>>>>statistically invalid to do this.
>>>
>>>You are assuming that I compared my Elo from somewhere else with the Elo that I got
>>>in USCF and was not happy. No, I compared the Elo that I got with the Elo of
>>>other people who played worse than me here in the US. It took me a _long_ time until
>>>that was reversed, and still my Elo did not reach a balance. Partially, because
>>>it is difficult to increase your Elo fast when you play opposition that is weaker
>>>than you.
>>
>>That is what the statistics involved produce.  And it is a _desired_ effect, in
>>fact.  Otherwise you could beat nobodies and produce a huge rating.
>>
>>
>>
>>>Besides, USCF ratings are slightly inflated compared to FIDE, so even if I had
>>>made that comparison, I would not have been wrong. I was really tired of listening to my
>>>opponents saying: Are you really 2050?
>>>
>>>Karpov 2596? Come on!!! He played the US Amateur and beat a couple of players
>>>with a very low rating and that was the result. Yes, 6 games, but he won all of
>>>them.
>>>http://www.64.com/uscf/ratings/12730227
>>
>>So?  You can't re-write the statistics to produce a result you want for a
>>special case...  I believe that USCF uses a FIDE rating as the initial rating if the
>
>No, it is not a special case. The case I am pointing out shows the flaw.
>Karpov was not 2596 in '98. What did he do wrong? He agreed to play against a
>couple of low-rated people who screwed up the average.
>Bruce Moreland pointed out the flaw in another message. You can really inflate your
>rating if you play against strong people at the beginning, or deflate it if
>you play only very weak ones. It is enough to include enough weak opponents to drag
>your average to the bottom.
>
>Miguel
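
A minimal sketch of the game-by-game calculation described in the quoted message, next to the average-based estimate it is compared against. The logistic expectancy curve, the bisection search, and all function names are assumptions for illustration; the thread does not specify them, and this is not the actual ICC or USCF code.

```python
def expected_score(rating, opponent):
    """Logistic Elo expectancy (assumed curve; the thread does not fix one)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def average_based_rating(results):
    """Provisional-style estimate: opponent +400 for a win, -400 for a loss,
    unchanged for a draw, averaged over all games."""
    return sum(opp + 800.0 * (score - 0.5) for opp, score in results) / len(results)


def game_by_game_rating(results, lo=0.0, hi=4000.0, tol=0.01):
    """Find the rating whose summed per-game expectancy equals the points
    actually scored. Bisection works because expectancy rises with rating."""
    total = sum(score for _, score in results)
    if total <= 0 or total >= len(results):
        # A 0% or 100% score has no finite solution under this method.
        raise ValueError("no finite rating fits a 0% or 100% score")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, opp) for opp, _ in results)
        if expected < total:
            lo = mid  # candidate rating too low: it would score fewer points
        else:
            hi = mid  # candidate rating too high
    return (lo + hi) / 2.0


# The six games from the example: three draws vs 2600, three wins vs 2000.
games = [(2600, 0.5)] * 3 + [(2000, 1.0)] * 3
print(round(average_based_rating(games)))  # 2500
print(round(game_by_game_rating(games)))   # slightly above 2600
```

With the six results from the example, the average-based figure comes out at 2500, while the game-by-game solve lands slightly above 2600. An all-wins record such as the Karpov case has no finite solution under this method, which is why the sketch raises an error there.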

Of course you are right, and there may be even further improvements.

The system does not use the results of other players to change the rating.

Suppose that your opponent wins all of his games and improves his rating after a game
against you.
I think that it may be logical to use that information to increase your rating,
because it is logical to assume that your opponent's rating was wrong, but
the system does not do it.

I do not suggest exactly how to do it and it seems that the problem does not
interest the ICC.


If ICC cares about creating a better rating system,
they can offer $10,000 to the people who find the best rating system.

It seems that they do not care so they will not do it.

I also have a definition by which we can compare different rating systems.
A rating system should give the expected result in every game.
It is easy to use the sum of squares of the differences between expected and
actual results to find the practical error, and the rating system that gives
the smallest error is the most logical one to choose.
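
One way such a comparison might be scored, as a rough sketch: for each game, take the ratings a system assigned before the game, compute the expected result, and add up the squared differences from the actual result. The logistic expectancy formula, the function names, and the sample data are all assumptions for illustration and are not taken from any real rating system.

```python
def expected_score(rating_a, rating_b):
    """Logistic Elo expectancy for player A against player B (assumed curve)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def sum_squared_error(games):
    """games: (rating_a, rating_b, score_a) triples, where the ratings are the
    ones the system under test assigned *before* the game was played and
    score_a is 1, 0.5, or 0."""
    return sum((score - expected_score(ra, rb)) ** 2 for ra, rb, score in games)


# Hypothetical data: the same three results, rated by two made-up systems.
system_x = [(2635, 1400, 1.0), (2600, 2600, 0.5), (2300, 2000, 1.0)]
system_y = [(2000, 2000, 1.0), (2600, 2600, 0.5), (2000, 2300, 1.0)]

better = "X" if sum_squared_error(system_x) < sum_squared_error(system_y) else "Y"
print(better)  # the system with the smaller error predicted these games better
```

The comparison is only fair if both systems are scored on the same games and each prediction uses ratings computed without knowledge of that game's result.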

Uri
