Computer Chess Club Archives



Subject: Re: Russek -Rebel Match, Game 2

Author: James T. Walker

Date: 12:18:44 01/02/00



On January 01, 2000 at 18:58:56, Stephen A. Boak wrote:

>>On January 01, 2000 at 10:20:48, James T. Walker wrote:
>
><snip>
>
>>...I think if you take all aspects of the game into account you will
>>find it's (60 Points) about right vs humans too.  This may diminish as ratings
>>get higher because most rating systems use different "K" factors to calculate
>>the ratings.  The ICC ratings are inflated I think partly because they use K=32
>>for all players while the USCF uses K=8 for 2400 and above.  The lower K factor
>>causes more stability in the ratings and less inflation at the higher levels.
>>This may be what is causing the 60 points to diverge at higher ratings and to be
>>even more at lower ratings.  This is all my theory of course.
>>Seasons Greetings,
>>Jim Walker
>
>Hi Jim,
>  In an ELO-based system, rating lags performance--always.
>  A newly calculated rating takes into account not only the player's recent
>games, but also the player's performance over many prior games (as embodied in
>the last calculated ELO rating of the player, i.e. Starting Rating).
>  A difference in ELO ratings leads directly (due to the math behind the ELO
>rating system) to a statistically valid expectation for performance of one
>player versus another, or one player versus a particular field of opponents.
>  The statistical basis for ELO-ratings takes account of (assumes) natural
>variation--the fact that a player's results (versus expectation) will vary
>randomly, event after event.  The ELO system assigns ratings that statistically
>lead (through math calculations) to the average expected result versus any
>particular opponent or field of opponents with known ratings.  Otherwise your
>rating would simply reflect the results of your latest play.
>  The ELO system also measures the growth or decline in playing strength of a
>player over time.  Indeed one cannot know if a player's strength has grown or
>declined except by measuring it *over time*.
>  The K-factor has very little to do with inflation or deflation in general and
>nothing to do with the relative comparison of human strength vs computer
>strength.
>  It is a factor that helps all players more quickly rise or fall in rating
>(although there is always a lag, even among the lower rated players for which
>the K-factor is higher in the USCF), according to their relatively current playing
>form.  More properly stated, it is a mathematical factor that restricts or aids
>the speed at which ratings change.
>  In general, over time and many games of play, a human has achieved his proper
>rating, relative to the other players in the overall rating pool.  This is an
>underlying assumption of the ELO system.
>  Some particular humans may be deemed overrated (rating is inflated), some may
>be deemed underrated (rating is deflated), but by and large the overall group is
>rated properly on the average.
>  The fact that ratings at high levels in the USCF are governed by a smaller
>K-factor and therefore do not change as rapidly as for a lower rated player does
>not hinder a strong player's rating from rising or falling, over time, to its
>proper level.  It simply governs how quickly (how many games it will take, i.e.
>how long the lag will be) a rating comes to reflect true strength, assuming the
>player has risen or dropped in true strength (perhaps by studying/playing a lot
>versus being inactive in both aspects or through aging--whatever the reason), or
>is a new player for whom a measure of strength must be established--over time.
>  Consider that J. Polgar, after losing to A. Shirov by a 5.5 to 0.5 score, might
>have dropped several hundred rating points 'instantly' if her new rating thereafter
>were based solely on her particular (latest) match at that time; that is, if her new
>rating were not based on 1) her prior rating based on prior games, and 2) the
>K-factor (whatever it might be in the FIDE ELO system), as well as 3) her recent
>performance.
>  Note a similar observation holds for Shirov, who might otherwise have vaulted
>'instantly' to a rating much higher than Kasparov!
>  Polgar simply had a natural variation in results (versus expectancy) and did
>much worse than her 'average' expectation for scoring based on the starting FIDE
>ratings for the players.  Her true strength is presumably still in the 2600s,
>despite a single bad match or tournament result.  By the same token, Shirov
>simply had a natural variation in results (versus expectancy) and did much
>better than his 'average' expectation for scoring based on the starting FIDE
>ratings for the players.
>  The K-factor used in ELO rating changes not only prevents relatively more
>rapid gain in rating after a good performance, but it also protects against more
>rapid loss in rating, after a poor performance (below ELO expectation).
>  I don't call it a factor to regulate inflation/deflation, but merely a factor
>to regulate the relative weighting of a player's prior performance (embodied in
>the starting rating for a rated period of reported games) versus their recent
>performance in the new games submitted for rating in the rating period.
>  Don't forget, the K-factor works both ways--to limit the speed of gain in
>rating points, and to limit the speed of loss of rating points.  That is
>evenhanded in general, favoring neither inflation nor deflation of a pool of
>players in the ELO based rating system.
>  You might find the book written by Arpad Elo on his rating system to be very
>informative and a big help to understand the mathematics and statistical
>foundation intentionally devised by Elo as underpinnings to his rating system.
>  I do not claim, nor did Elo, that his rating system was 'perfect' for measuring
>true playing strength of any particular player.  Quite the opposite, so to
>speak.  He intentionally took into account the known statistical fact that
>natural variation occurs in all measured processes, including the play of rated
>human players.  He also intentionally took into account that player strength may
>increase or decline over time.
>  The K-factor is a necessary factor (no particular value for the K-factor is
>necessary, however) to establish mathematically by how much performance will
>lead rating, or rating will lag performance.  This balances a new rating
>calculation somewhere between prior rating and current performance (example,
>recent TPR).  A new rating never jumps to equal latest TPR nor to exceed latest
>TPR--it only changes a portion of the distance to latest TPR and approaches it.
>  Yes, it helps establish some kind of stability, more or less, for ratings in a
>pool of rated players that experience both natural variation about a mean
>expectation based on their last ELO rating, and growth or decline in strength over time.
>It doesn't favor or disfavor ratings inflation in general, at any particular
>level, even in USCF rating system that uses 3 different K-factors.  The ELO
>system assumes that a player's playing strength may be measured in some manner,
>over time, and thus assumes some inherent stability in true playing strength,
>for most players in a rating pool--they are neither growing rapidly in strength
>nor falling rapidly in strength.
>  ELO carefully treats the situation of new players entering a rating pool, to
>establish their starting ratings without undue inflation or deflation.
>  Take care,
>    --Steve Boak
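
The weighting of prior rating against recent results described in the quoted post can be sketched with the standard Elo update, new = old + K * (points scored - points expected). The numbers below are assumptions for illustration only: both players are given a hypothetical pre-match rating of 2650, the match is rated as a single batch, and the K values are arbitrary samples rather than the value FIDE actually used. The function names are made up for this sketch.

```python
def expected_score(rating, opponent):
    """Elo expected score per game (logistic curve on a 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def match_update(rating, opponent, points, games, k):
    """New rating after a match against one opponent, rated as one batch."""
    expected_points = games * expected_score(rating, opponent)
    return rating + k * (points - expected_points)


# Hypothetical: two equally rated players (2650 each), 6 games, 5.5 to 0.5.
for k in (8, 16, 32):
    loser = match_update(2650, 2650, 0.5, 6, k)
    winner = match_update(2650, 2650, 5.5, 6, k)
    print(k, loser, winner)
# 8 2630.0 2670.0
# 16 2610.0 2690.0
# 32 2570.0 2730.0
```

Under these assumptions the lopsided result moves each rating by tens of points, scaled directly by K, rather than by the several hundred points that a rating based only on the latest match would imply.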


Hello Steve,
Thanks for your explanation of the ELO system.  My statement was that the K
factor was "partly" to blame for the ICC rating inflation.  The reason I say this
is because when K=32 and two players are equal in strength they are playing for
16 points in the case of a win/loss.  This means if I have a "Hot streak" and
win 4 or 5 games in a row vs players of the same rating as me I can raise my
rating almost 100 points in a very short time.  Of course the opposite is also
true but temporarily my rating is inflated.  In the same scenario if K=8 then I
would only increase my rating by 20-25 points and would never reach that
glorious plateau 100 points higher than I really am.  On a really good day I
could pick my opponents cleverly and gain even more.  So playing for more points
may cause temporary inflation and does cause volatility in the ratings.  I will
reach highs I might ordinarily never reach.  Never mind the lows because they
will even out.  The inflation is a fact on ICC and this is only a small reason.
Jim Walker
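
The hot-streak arithmetic above can be checked with a short sketch. It assumes the standard Elo update applied game by game, and that every opponent is rated exactly the same as the player's current rating at the time of the game (so each win is worth K/2). The starting rating of 2000 and the function names are hypothetical.

```python
def expected_score(rating, opponent):
    """Elo expected score per game (logistic curve on a 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def streak_gain(k, wins, start=2000.0):
    """Rating points gained from consecutive wins over equally rated opponents."""
    rating = start
    for _ in range(wins):
        opponent = rating  # assumption: opponent matches the current rating
        rating += k * (1.0 - expected_score(rating, opponent))
    return rating - start


for k in (32, 8):
    print(k, streak_gain(k, 4), streak_gain(k, 5))
# 32 64.0 80.0
# 8 16.0 20.0
```

Under these assumptions a streak of 4 or 5 wins is worth 64 to 80 points at K=32 and 16 to 20 points at K=8, somewhat below the rough figures quoted in the post, though the fourfold difference in volatility is the same.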




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.