Author: James T. Walker
Date: 12:18:44 01/02/00
On January 01, 2000 at 18:58:56, Stephen A. Boak wrote:

>>On January 01, 2000 at 10:20:48, James T. Walker wrote:
>
><snip>
>
>>...I think if you take all aspects of the game into account you will
>>find it's (60 Points) about right vs humans too. This may diminish as ratings
>>get higher because most rating systems use different "K" factors to calculate
>>the ratings. The ICC ratings are inflated I think partly because they use K=32
>>for all players while the USCF uses K=8 for 2400 and above. The lower K factor
>>causes more stability in the ratings and less inflation at the higher levels.
>>This may be what is causing the 60 points to diverge at higher ratings and to be
>>even more at lower ratings. This is all my theory of course.
>>Seasons Greetings,
>>Jim Walker
>
>Hi Jim,
> In an ELO-based system, rating lags performance--always.
> A newly calculated rating takes into account not only the player's recent
>games, but also the player's performance over many prior games (as embodied in
>the last calculated ELO rating of the player, i.e. Starting Rating).
> A difference in ELO ratings leads directly (due to the math behind the ELO
>rating system) to a statistically valid expectation for performance of one
>player versus another, or one player versus a particular field of opponents.
> The statistical basis for ELO-ratings takes account of (assumes) natural
>variation--the fact that a player's results (versus expectation) will vary
>randomly, event after event. The ELO system assigns ratings that statistically
>lead (through math calculations) to the average expected result versus any
>particular opponent or field of opponents with known ratings. Otherwise your
>rating would simply reflect the results of your latest play.
> The ELO system also measures the growth or decline in playing strength of a
>player over time. Indeed one cannot know if a player's strength has grown or
>declined except by measuring it *over time*.
> The K-factor has very little to do with inflation or deflation in general and
>nothing to do with the relative comparison of human strength vs computer
>strength.
> It is a factor that helps all players more quickly rise or fall in rating
>(although there is always a lag, even among the lower rated players for which
>the K-factor is higher in the USCF), according to their relatively current
>playing form. More properly stated, it is a mathematical factor that restricts
>or aids the speed at which ratings change.
> In general, over time and many games of play, a human has achieved his proper
>rating, relative to the other players in the overall rating pool. This is an
>underlying assumption of the ELO system.
> Some particular humans may be deemed overrated (rating is inflated), some may
>be deemed underrated (rating is deflated), but by and large the overall group is
>rated properly on the average.
> The fact that ratings at high levels in the USCF are governed by a smaller
>K-factor and therefore do not change as rapidly as for a lower rated player does
>not hinder a strong player's rating from rising or falling, over time, to its
>proper level. It simply governs how quickly (how many games it will take, i.e.
>how long the lag will be) a rating comes to reflect true strength, assuming the
>player has risen or dropped in true strength (perhaps by studying/playing a lot
>versus being inactive in both aspects or through aging--whatever the reason), or
>is a new player for whom a measure of strength must be established--over time.
> Consider that J. Polgar, after losing to A. Shirov by a 5.5 to 0.5 score, might
>have dropped several hundred rating points 'instantly' if her new rating
>thereafter was based solely on her particular (latest) match at that time; that
>is, if her new rating was not based on 1) her prior rating based on prior games,
>and 2) the K-factor (whatever it might be in the FIDE ELO system), as well as
>3) her recent performance.
> Note a similar observation holds for Shirov, who might otherwise have vaulted
>'instantly' to a rating much higher than Kasparov!
> Polgar simply had a natural variation in results (versus expectancy) and did
>much worse than her 'average' expectation for scoring based on the starting FIDE
>ratings for the players. Her true strength is presumably still in the 2600's,
>despite a single bad match or tournament result. By the same token, Shirov
>simply had a natural variation in results (versus expectancy) and did much
>better than his 'average' expectation for scoring based on the starting FIDE
>ratings for the players.
> The K-factor used in ELO rating changes not only prevents relatively more
>rapid gain in rating after a good performance, but it also protects against more
>rapid loss in rating, after a poor performance (below ELO expectation).
> I don't call it a factor to regulate inflation/deflation, but merely a factor
>to regulate the relative weighting of a player's prior performance (embodied in
>the starting rating for a rated period of reported games) versus their recent
>performance in the new games submitted for rating in the rating period.
> Don't forget, the K-factor works both ways--to limit the speed of gain in
>rating points, and to limit the speed of loss of rating points. That is
>evenhanded in general, favoring neither inflation nor deflation of a pool of
>players in the ELO based rating system.
> You might find the book written by Arpad Elo on his rating system to be very
>informative and a big help to understand the mathematics and statistical
>foundation intentionally devised by Elo as underpinnings to his rating system.
> I do not claim, nor did Elo, that his rating system was 'perfect' for measuring
>true playing strength of any particular player. Quite the opposite, so to
>speak. He intentionally took into account the known statistical fact that
>natural variation occurs in all measured processes, including play of rated
>human players. He also intentionally took into account that player strength may
>increase or decline over time.
> The K-factor is a necessary factor (no particular value for the K-factor is
>necessary, however) to establish mathematically by how much performance will
>lead rating, or rating will lag performance. This balances a new rating
>calculation somewhere between prior rating and current performance (for example,
>recent TPR). A new rating never jumps to equal latest TPR nor to exceed latest
>TPR--it only changes a portion of the distance to latest TPR and approaches it.
> Yes, it helps establish some kind of stability, more or less, for ratings in a
>pool of rated players that experience both natural variation about a mean
>expectation about last ELO rating, and growth or decline in strength over time.
>It doesn't favor or disfavor ratings inflation in general, at any particular
>level, even in the USCF rating system that uses 3 different K-factors. The ELO
>system assumes that a player's playing strength may be measured in some manner,
>over time, and thus assumes some inherent stability in true playing strength,
>for most players in a rating pool--they are neither growing rapidly in strength
>nor falling rapidly in strength.
> ELO carefully treats the situation of new players entering a rating pool, to
>establish their starting ratings without undue inflation or deflation.
> Take care,
> --Steve Boak

Hello Steve,
Thanks for your explanation of the ELO system. My statement was that the K factor was "partly" to blame for the ICC rating inflation. The reason I say this is because when K=32 and two players are equal in strength they are playing for 16 points in the case of a win/loss. This means if I have a "hot streak" and win 4 or 5 games in a row vs players of the same rating as me I can raise my rating almost 100 points in a very short time. Of course the opposite is also true, but temporarily my rating is inflated.

In the same scenario, if K=8 then I would only increase my rating by 20-25 points and would never reach that glorious plateau 100 points higher than my real strength. On a really good day I could pick my opponents cleverly and gain even more. So playing for more points may cause temporary inflation and does cause volatility in the ratings. I will reach highs I might ordinarily never reach. Never mind the lows because they will even out. The inflation is a fact on ICC and this is only a small part of the reason.
Jim Walker
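
For illustration, the hot-streak arithmetic above can be sketched in Python. This is only a minimal sketch, assuming the standard Elo expected-score formula on a 400-point scale. The function names, the 2000 starting rating, and the switch between opponents rated at the current rating and opponents rated at the starting rating are all hypothetical choices made for the example, not anything confirmed about how ICC or the USCF computes ratings.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score of `rating` against `opponent` (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(rating, opponent, score, k):
    """One Elo update: move the rating by K times (actual score - expected score)."""
    return rating + k * (score - expected_score(rating, opponent))


def streak_gain(start, wins, k, opponent_matches_current=True):
    """Rating gained from `wins` consecutive wins.

    Assumption for this sketch: each opponent is rated either at the player's
    current rating (default) or at the player's starting rating.
    """
    rating = start
    for _ in range(wins):
        opponent = rating if opponent_matches_current else start
        rating = update(rating, opponent, 1.0, k)
    return rating - start


if __name__ == "__main__":
    for k in (32, 8):
        same = streak_gain(2000, 5, k)
        fixed = streak_gain(2000, 5, k, opponent_matches_current=False)
        print(f"K={k}: +{same:.1f} (opponents at current rating), "
              f"+{fixed:.1f} (opponents at starting rating)")
```

When every opponent is rated exactly at the player's current rating, each win is worth K/2, so five straight wins give 80 points at K=32 and 20 points at K=8. If the opponents stay at the starting rating instead, each successive win is worth a little less, because the expected score rises as the rating climbs.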