Computer Chess Club Archives

Subject: Re: "Battle of the Crowns" Contest status:

Author: blass uri
Date: 15:37:18 07/28/00
On July 28, 2000 at 18:14:13, Dann Corbit wrote:

>On July 28, 2000 at 15:58:53, blass uri wrote:
>
>>On July 28, 2000 at 15:33:47, Dann Corbit wrote:
>><snipped>
>>>    Program          Elo    +   -   Games   Score   Av.Op.  Draws
>>>
>>>  1 LarsenVB       : 2610  186 226    12    79.2 %   2378   25.0 %
>>>  2 Storm          : 2557  223 166    12    58.3 %   2498   33.3 %
>>>  3 Noonian        : 2546  232 144    12    54.2 %   2517   41.7 %
>>>  4 Ozwald         : 2542  166 223    12    41.7 %   2601   33.3 %
>>>  5 Monik          : 2363  214 247    12    62.5 %   2274    8.3 %
>>>  6 Zephyr         : 2317  215 215    12    50.0 %   2317   16.7 %
>>>  7 TSCP           : 2293  180 402    12    83.3 %   2013    0.0 %
>>>  8 SnailSCP       : 2185  214 194    12    62.5 %   2096   25.0 %
>>>  9 Raffaela       : 1893  297 170    12     8.3 %   2310   16.7 %
>>> 10 Golem01        : 1695    0   0    12     0.0 %   2295    0.0 %
>>
>>The elo is simply wrong.
>
>The ELO is approximate, and correct within the stated error bars.
>
>>The right way to calculate Elo ratings from a tournament is simply to assume that the
>>tournament happens again and again, and to calculate the limit of the Elo rating of
>>every program as the number of games goes to infinity
>
>Naturally, this is the process that was used.  About 100 iterations, if I recall
>correctly, is what was used to calculate the table.

I guess that 100 iterations are not enough, and that after 100000
iterations TSCP is going to have a better rating than Monik (it is clear that
Monik's rating cannot be better than TSCP's after enough iterations).

Calculating 100000 iterations is not a problem for a computer.
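
A rough sketch of the "repeat the tournament again and again" idea, for illustration only. It is not the program that produced the table above, and the four players and their results are hypothetical, not the contest crosstable. It assumes the usual logistic Elo expectation and repeats a plain Elo update over the same results until every player's expected points match the actual points. Here T drew its match with M and won everything else, while M met stronger opposition.

```python
def expected(r_a, r_b):
    """Expected score of a player rated r_a against one rated r_b (logistic Elo)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

# Hypothetical mini-tournament (NOT the contest data):
# (player_a, player_b, games_played, points_scored_by_a)
RESULTS = [
    ("T", "M", 2, 1.0),   # T and M split their games
    ("T", "W", 2, 2.0),   # T wins every other game it plays
    ("M", "S", 2, 1.0),
    ("S", "W", 2, 1.5),
]

def limit_ratings(results, start=2300.0, k=32.0, max_iter=100000, tol=1e-9):
    """Replay the same results until actual and expected points agree."""
    players = sorted({p for a, b, _, _ in results for p in (a, b)})
    ratings = {p: start for p in players}
    for _ in range(max_iter):
        surplus = {p: 0.0 for p in players}   # actual minus expected points
        for a, b, games, score_a in results:
            exp_a = games * expected(ratings[a], ratings[b])
            surplus[a] += score_a - exp_a
            surplus[b] += exp_a - score_a
        if max(abs(s) for s in surplus.values()) < tol:
            break                             # ratings have stopped moving
        for p in players:
            ratings[p] += k * surplus[p]
    return ratings

ratings = limit_ratings(RESULTS)
for name in sorted(ratings, key=ratings.get, reverse=True):
    print(name, round(ratings[name]))
```

In this toy data T ends up above M in the limit: T's expected score against W is always below 100%, so T's points can only be matched if it is expected to score more than 50% against M.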
>
>>(you should not
>>include programs that have 0% or 100%).
>
>What is your scientific reason for exclusion of real data?  It is just as valid
>to win or lose all of your games as it is to win only a fraction of them.

The problem is that if one program has a rating of minus infinity (a 0% score
leads to a rating of minus infinity when the number of iterations goes to infinity),
the other programs get a rating of plus infinity relative to it and we cannot
put the other programs in order.
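
To illustrate why a 0% score has no finite rating, here is a small sketch that assumes the standard logistic Elo formula, where a score fraction p corresponds to a rating gap of 400*log10(p/(1-p)). The function name `rating_gap` is made up for this example.

```python
import math

def rating_gap(score):
    """Elo gap implied by a score fraction under the logistic Elo model."""
    if score <= 0.0:
        return -math.inf          # no finite gap predicts exactly 0%
    if score >= 1.0:
        return math.inf           # no finite gap predicts exactly 100%
    return 400.0 * math.log10(score / (1.0 - score))

# The gap needed to explain a single draw keeps growing with the number
# of games, so a score of exactly 0% has no finite limit.
for games in (12, 120, 1200):
    print(games, round(rating_gap(0.5 / games)))
print(rating_gap(0.0))            # -inf
```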
>
>>In this case you should not count Golem01 because the Elo rating of this program will
>>always go down (the expected result of Golem01 is more than 0% even if there is a
>>difference of 1000000 Elo).
>
>Golem will eventually win or draw.  I expect Golem01 versus Raffaela to be about
>even.

When that happens you can include Golem in the rating list, provided that it is
impossible to divide the programs into 2 classes where one class scored 100% against
the other class.
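
The two-class condition can be checked mechanically. A minimal sketch, assuming results are stored as (player_a, player_b, games, points_for_a) tuples: draw an arrow from X to Y whenever X took at least half a point from Y, and a 100% split exists exactly when some program cannot be reached from some other program along the arrows. The function name and the sample results are hypothetical.

```python
def can_split_into_classes(players, results):
    """True if the players can be divided into two classes such that one
    class took every point from the other (this also covers two classes
    that never played each other)."""
    took_points = {p: set() for p in players}   # p scored against these
    gave_points = {p: set() for p in players}   # these scored against p
    for a, b, games, score_a in results:
        if score_a > 0:                          # a took something from b
            took_points[a].add(b)
            gave_points[b].add(a)
        if score_a < games:                      # b took something from a
            took_points[b].add(a)
            gave_points[a].add(b)

    def reachable(start, edges):
        seen, stack = {start}, [start]
        while stack:
            for nxt in edges[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    everyone = set(players)
    first = players[0]
    connected = (reachable(first, took_points) == everyone
                 and reachable(first, gave_points) == everyone)
    return not connected

# Hypothetical results: G has lost every game so far.
before = [("A", "B", 2, 1.0), ("A", "G", 2, 2.0), ("B", "G", 2, 2.0)]
# The same results after G draws one game against B.
after = [("A", "B", 2, 1.0), ("A", "G", 2, 2.0), ("B", "G", 2, 1.5)]
print(can_split_into_classes(["A", "B", "G"], before))   # True
print(can_split_into_classes(["A", "B", "G"], after))    # False
```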


>
>>TSCP also deserves at least the same rating as Monik because TSCP got 50%
>>against Monik and 100% against the other players, so the rating of TSCP should always
>>improve while it is behind Monik's as the results repeat.
>
>TSCP has played weaker opponents than Monik, according to the calculations.
>Eventually, the error bars will reduce and I expect that in the end, each ELO
>positional value will agree with the ordinary contest ranking positional value
>(points scored, and tie-breaks).

It is clear that in the end the Elo ratings will be in the same order as the ranking, but
when you have only part of the data the calculated ratings are wrong.

Uri