Author: Stephen A. Boak
Date: 11:46:08 08/26/00
In the threads below, there is discussion about comparing the performances of various programs using 1) Ranking or average Ranking (for 'teams') or 2) TPR - Tournament Performance Rating - or average TPR (for 'teams').

I suggest that using 2) TPR as a measure of which program (or team) is best is inappropriate under many circumstances, including those of the just-concluded WMCC competition, where the ratings of the participants varied greatly.

Examples: My rating is approx. 1900. Let's say I play in a 4-round tournament in an appropriate class section (Under 2000 rating--which typically includes mostly 1800 to 1999 rated players) that happens to include several very weak entrants--possibly up-and-coming young players who want tougher competition to foster their chess development.

Example 1: Assume I play the following opponents, and have the following results (Ro = opponent's rating; GPR = Game Perf Rtg, using the +/- 400 rule for the TPR calc):

Ro      Pts     GPR
1300    1       1700
1950    0       1550
1975    1       2375
1960    0.5     1960
TPR             1896

In this example, my TPR (1896) is less than my current (starting) rating of 1900, despite the fact that I scored 1.5 / 3 against better players and 1 / 1 against lower rated players--all results better than my expected average scores against such rated opponents. Clearly I performed better than a typical 1900 player would (on average), yet my TPR is lower than 1900.

Why? The best TPR you can achieve is limited when you play players far below your rating. Even if you beat one of them (in the example above, I beat a 1300 player, 600 points below my rating), that result will 'artificially' lower your average TPR for the tournament.

Example 2: In a small local tournament, my rating (1900) is by far the highest among the participants, all of whom are low rated. If I play four 1300 players in a 4-round event and beat them all, my TPR will be 1700 (limited by their low ratings).

ELO Systems are better.
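The TPR arithmetic in the two examples above can be checked with a short sketch. The function names and the data layout are illustrative assumptions; the only rule used is the +/- 400 convention mentioned in the post (opponent's rating + 400 for a win, - 400 for a loss, unchanged for a draw), averaged over all games.

```python
def game_perf(opp_rating, score):
    """Per-game performance rating under the +/- 400 rule.

    score is 1 (win), 0.5 (draw) or 0 (loss).
    """
    # win -> +400, draw -> +0, loss -> -400
    return opp_rating + 800 * (score - 0.5)


def tpr(results):
    """TPR as the plain average of the per-game performance ratings."""
    return sum(game_perf(r, s) for r, s in results) / len(results)


# (opponent rating, points scored), as listed in the examples
example_1 = [(1300, 1), (1950, 0), (1975, 1), (1960, 0.5)]
example_2 = [(1300, 1)] * 4

print(round(tpr(example_1)))  # 1896
print(round(tpr(example_2)))  # 1700
```

A perfect 4-0 score in Example 2 still cannot lift the TPR above 1700, because under this rule the average can never exceed the average opponent rating plus 400.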
By USCF rating rules (ELO-based), I will gain a few rating points (approx. 9 points) for my performance in the Example 1 tournament. I might gain 1 to 4 points (maximum) in Example 2.

CONCLUSION: TPRs are most useful and meaningful for comparative purposes when:

1. A compared program has a well-established rating (not perfect, but based on many prior games and results against rated opponents).
2. Large numbers of games are included in the TPR calculations (say 20 or more).
3. The games included in the TPR calculations are against a wide variety of opponents, rated both above and below your mean (average) rating.

With many games, against a wide variety of opponents, both above and below your rating, the possible TPR skewing due to playing an occasional player well above or below your rating is relatively small when averaged in the TPR calculation over many games.

ELO formulas, however, take into account both your and your opponent's ratings, in order to determine statistical expectancies for scoring. When you perform better than expected based on starting ratings (score more points than expected), your rating will increase. The opposite will occur when you perform worse than expected.

By contrast, TPR used for comparison purposes has its limitations. For a single Swiss pairing tournament, the pairings of the individual programs may differ greatly, depending on which programs score better early in the tournament and which score better late in the tournament. Two programs that tie in final score may have significantly different TPRs--not due to the inherent abilities of the two programs, but due to the random factors involved in the pairings during the entire tournament.

The large number of relatively weaker participants in the recent WMCC competition led to TPRs that are not very useful for comparing performances.

--Steve
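The rating changes quoted in the post (about 9 points for Example 1, at most about 4 for Example 2) can be reproduced with the standard ELO expected-score formula. This is a sketch rather than the exact USCF procedure: the K-factor of 32 is an assumption (it happens to match the figures in the post), and the function names are illustrative.

```python
def expected_score(own, opp):
    """Standard ELO expected score for a player rated `own` against `opp`."""
    return 1.0 / (1.0 + 10 ** ((opp - own) / 400.0))


def rating_change(own, results, k=32):
    """K * (actual points - expected points); k=32 is an assumed K-factor."""
    actual = sum(score for _, score in results)
    expected = sum(expected_score(own, opp) for opp, _ in results)
    return k * (actual - expected)


# (opponent rating, points scored), as listed in the two examples
example_1 = [(1300, 1), (1950, 0), (1975, 1), (1960, 0.5)]
example_2 = [(1300, 1)] * 4

print(round(rating_change(1900, example_1), 1))  # roughly +9.4
print(round(rating_change(1900, example_2), 1))  # roughly +3.9
```

In both cases the change is positive because the points scored exceed the expectation, whereas the TPR in Example 1 came out below the starting rating of 1900.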
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.