Computer Chess Club Archives

Subject: Re: Chess Tiger 12.0 Performance Rating Update

Author: Stephen A. Boak
Date: 20:14:18 11/17/99
On November 17, 1999 at 17:59:29, Christophe Theron wrote:

>On November 17, 1999 at 03:06:29, Stephen A. Boak wrote:
>
>>On November 16, 1999 at 07:32:57, Christophe Theron wrote:
>>
>>>On November 16, 1999 at 06:49:28, Stephen A. Boak wrote:
>>>
>>>>On November 16, 1999 at 02:16:10, Christophe Theron wrote:
>>>>
>>>>>On November 15, 1999 at 23:01:17, Stephen A. Boak wrote:
>>>>>
>>>>>>Based on the following reported results for Chess Tiger 12.0 at 40/2hr time
>>>>>>controls, and assumed ratings shown for the opponents, I calculate the
>>>>>>Performance Rating for Chess Tiger for 122 games at 2726.
>>>>>>
>>>>>>SSDF Chesstiger12.0 (A450)-Fritz5.32 (P200MMX), 2575, 30.5-13.5
>>>>>>SSDF ChessTiger 12 K6-2 - Shredder 2 200MMX, 2503, 28.5-11.5
>>>>>>ChessTiger12-Fritz6, 2681(?): +5 =1 -0 Thorsten Czub
>>>>>>SSDF Chesstiger12.0 (equal)-Hiarcs 7.32 128MB K6-2 450 MHz, 2646, 19.5-12.5
>>>>>>
>>>>>>--Steve Boak
>>>>>
>>>>>What happens to the rating if you take out the games played by Thorsten? Because
>>>>>there is no evidence that Fritz6 is as strong as Fritz5.32...
>>>>>
>>>>>
>>>>>    Christophe
>>>>
>>>>Without the Thorsten games, ChessTiger 12.0 is 2711 for 116 games.  Yes, I
>>>>assumed Fritz6 was equal (absent evidence) to Fritz5.32.  Dropping those 6 games
>>>>is not very significant since so many games have elsewhere been reported against
>>>>opponents with established ratings.
>>>>
>>>>--Steve
>>>
>>>OK, thanks.
>>>
>>>Do you have a program or a spreadsheet to compute this?
>>>
>>>
>>>    Christophe
>>
>>I use a very simple spreadsheet with a very simple formula.  Can you read an
>>Excel file? or another style spreadsheet?  Let me know which one you can read,
>>and I'll send it to you by eMail.
>
>Yes I can read Excel97 files.
>
>
>
>>The simple formula (it can be very easily calculated by hand or programmed in
>>whatever language you want to use) is:
>>
>> A. Add up all opponent ratings for *all* the games,
>>then
>> B. Calculate (Total ChessTiger 12.0 score - Total Opponent score)*400,
>>then
>> C. Divide the grand sum (A+B) by the total number of rated games.
>>
>>It's that easy.
>>
>>NOTE--for the Step B calculation, you can simply use
>>        (CT12WinQty - CT12LossQty)*400
>>if it is easier to use the Win and Loss counts for CT12, instead of Total Scores
>>for CT12 and its Opponents.  Here is another way of looking at the same
>>calculation and how to 'program' it:
>>
>>[RtgOpp1*QtyGamesVsOpp1 +
>> RtgOpp2*QtyGamesVsOpp2 +
>> RtgOpp3*QtyGamesVsOpp3 + ...
>> RtgOppN*QtyGamesVsOppN +
>> (CT12TotScore - OppTotScore)*400] / TotRatedGames
>>
>>I call this the +/- 400 (plus minus 400) rule for Performance Rating
>>calculation.  It is the same as averaging your individual game performance
>>rating for each game, based on:
>>A. If you win, you get your opponent's rating plus 400 pts.
>>B. If you lose, you get your opponent's rating minus 400 pts.
>>C. If you draw, you get your opponent's rating.
>>
>>This, by the way, is the basic formula used to establish a provisional rating
>>for a new USCF player in the United States.  After approximately 20 games, the
>>provisional rating becomes a 'permanent' rating, the +/- 400 rule is no longer
>>used, and the USCF then uses a version of ELO rating formulas to alter the
>>'permanent' rating thereafter.
>>
>>The +/- 400 rule is used, however, any time it is desired to calculate the
>>Tournament Performance Rating (TPR) for a player in a particular tournament.
>>The tournament can be USCF or FIDE type, or any similarly rated, ELO-based
>>system.  If I received a TPR of 2850 in a FIDE tournament, that would mean my
>>performance level (score obtained versus my particular opponents and their
>>particular ratings) was about the same as what Kasparov would be expected to
>>obtain, if he played those same opponents instead of me.  :)
>>
>>It might work also, over a large number of games, on another rating system, such
>>as the English or British rating system.  But the magnitude 400 is probably
>>scaled for mathematical ELO-systems like USCF and FIDE.  You often see the TPR
>>ratings shown, for example, in the London Chess Centre's This Week in Chess
>>(TWIC) reports of major tournaments (as a column in the results crosstable).  In
>>this case it is used as a measure of how well a player performed in a single
>>tournament, against a certain field of opponents.
>>
>>In this instance, since CT12 opponents are rated at the SSDF ELO-based system,
>>the Performance Rating I calculated for CT12 may be compared to other comp-comp
>>ratings issued by SSDF.  Thus it approximates the SSDF rating that CT12 has
>>earned so far due to its score versus those particular SSDF-rated opponents.
>>
>>I keep my personal human results, including score and opponent rating, on a
>>spreadsheet.  I use the TPR +/- 400 rule to calculate my TPR, tournament by
>>tournament, as well as my 'running TPR' (which changes game by game) over the
>>last 50, 25 and 15 games.  These TPR figures are easily graphed by my
>>spreadsheet program (Excel) to track my performance trend over time.  I even
>>separately calculate my TPR with White pieces, then my TPR with Black pieces, so
>>I can see my trends there, analyze my strengths and weaknesses, and improve my
>>worst areas.  As you would expect, the TPR with White pieces is normally
>>higher than the TPR with Black pieces.  It is interesting to calculate how much
>>different those TPR ratings are--if you like statistics like I do!  :)
>>
>>--Steve
>
>
>Thanks a lot for all these explanations. I have printed your message and will
>keep it somewhere for future reference.
>
>What you explain sheds some light on things I have heard several times but was
>not able to understand.
>
>I assume the SSDF is using these rules too?
>
>
>    Christophe

I will check on the SSDF site to see how they calculate their ratings--I think I
have read about their method and they use the basic ELO-method, similar to what
USCF and FIDE use.  However, I am not sure.  Maybe a poster from SSDF will reply
before I satisfy myself by some research.

I don't think the +/- 400 TPR rule is used by them (although it is
possible)--after many games, that method will converge with a program's average
ELO-formula rating.  The ELO-method is designed to allow rating movement, up or
down, as a player's performance fluctuates over time.  However, the TPR for a
player (perhaps a computer chess program being rated) over all games is the
average ELO-rating of the player for *all* those games.
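
For anyone who prefers code to a spreadsheet, the +/- 400 rule quoted above can
be sketched in a few lines of Python.  This is only an illustration: the
function and variable names are my own invention, and the opponent ratings are
the assumed values from the original results list (the Fritz6 rating in
particular is a guess).

```python
# Each entry: (assumed opponent rating, Chess Tiger score, opponent score).
# Ratings and scores are the ones reported earlier in this thread.
results = [
    (2575, 30.5, 13.5),  # Fritz 5.32
    (2503, 28.5, 11.5),  # Shredder 2
    (2681, 5.5, 0.5),    # Fritz 6 (+5 =1 -0), rating assumed equal to Fritz 5.32 level
    (2646, 19.5, 12.5),  # Hiarcs 7.32
]


def performance_rating(results):
    """Performance rating by the +/- 400 rule.

    Step A: sum the opponent rating over every game played.
    Step B: add 400 * (own total score - opponents' total score).
    Step C: divide by the number of games.
    """
    games = sum(own + opp for _, own, opp in results)
    rating_sum = sum(rating * (own + opp) for rating, own, opp in results)
    bonus = 400 * sum(own - opp for _, own, opp in results)
    return (rating_sum + bonus) / games


print(round(performance_rating(results)))                    # 2726 (122 games)
print(round(performance_rating(results[:2] + results[3:])))  # 2711 (116 games, no Fritz 6)
```

Both figures match the ones I posted earlier, with and without the six Fritz6
games.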

Example--if a program has a good learning capability and improves its play over
time versus certain regular opponents, the TPR method would show a mean
(average) ELO-rating for all such games.  The true ELO-formula method would
instead track the latest rating, which would be somewhat higher than the
historical mean that the TPR converges to over time.

And it is possible there are some slight modifications or versions of the basic
ELO-method (the 'K' factor--or how many points, maximum, can be won or lost in a
single game--may vary, as in the USCF system, depending on the rating level of
the opponents).

I have an old book somewhere, 'The Rating of Chessplayers, Past and Present' by
Arpad Elo (I believe that is the title).  If you ever can purchase
or borrow a copy of this book, you might be delighted, as I was, to read the
mathematical basis behind the wonderful ELO system he invented to avoid the
shortcomings of many other rating systems.  The mathematical underpinnings are
stunning and very interesting to me.  Knowledge of the statistical basis for how
Elo's rating system works allows a person to more properly interpret ELO-based
ratings (for instance, to compare two players and understand their statistical
chances against each other based on their rating differences) and know how ELO
ratings change with new results.
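
As a rough sketch of those two ideas, here is the commonly quoted logistic
form of the expected score, plus a single-game rating update.  Treat it as an
illustration only: the function names are made up, the logistic curve is the
usual textbook approximation rather than the exact tables any federation
uses, and K=32 is simply an assumed value (real systems vary K).

```python
def expected_score(rating, opponent_rating):
    """Expected score (0..1) per game under the common logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def updated_rating(rating, opponent_rating, score, k=32):
    """New rating after one game; score is 1 (win), 0.5 (draw) or 0 (loss).

    k is an assumed illustrative value, not any federation's official K.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


print(expected_score(2500, 2500))            # 0.5 for equal ratings
print(round(expected_score(2700, 2500), 2))  # 0.76 for a 200-point favorite
print(updated_rating(2500, 2500, 1.0))       # 2516.0 after beating an equal opponent
```

With this form, K is exactly the maximum number of points that can change
hands in one game, and a win against a much weaker opponent gains almost
nothing.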

I will send the spreadsheet(s) soon (day or so), if all goes well (preparing for
a vacation to start very soon, and work is busy--need a bit of bulletin board
relaxation at the moment!).

--Steve