Computer Chess Club Archives


Subject: Weird EloStats

Author: Gian-Carlo Pascutto

Date: 05:33:09 01/13/02


On January 13, 2002 at 08:17:23, Gian-Carlo Pascutto wrote:

>On January 13, 2002 at 08:07:49, Gian-Carlo Pascutto wrote:
>
>>24 games isn't much, but it's still a huge score difference,
>>enough to be significant. I'll plug the result into elostat
>>and see what comes out.
>
>My mistake, I miscounted. There were only 5 draws and thus 23 games.
>
>  Program                          Elo    +   -   Games   Score   Av.Op.  Draws
>
>  1 Gambit Tiger 2               : 2723  114 182    23    80.4 %   2477   21.7 %
>  2 Gandalf 5                    : 2477  182 114    23    19.6 %   2723   21.7 %

If I read this result correctly, ELOSTAT indicates that no conclusion can be
drawn about who is stronger, since the error intervals overlap:
2723-182 = 2541 < 2477+182 = 2659.
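
A rough sketch of that overlap check, using the figures from the table. The function names are made up for illustration, and the logistic formula is only an assumption about how a score maps to a rating difference; ELOSTAT may well use a different model internally.

```python
import math

def intervals_overlap(elo_hi, minus_hi, elo_lo, plus_lo):
    """True if the higher-rated program's lower bound is below
    the lower-rated program's upper bound."""
    return elo_hi - minus_hi < elo_lo + plus_lo

def logistic_elo_diff(score):
    """Rating difference implied by a score fraction under the
    standard logistic Elo model (an assumption, see above)."""
    return 400.0 * math.log10(score / (1.0 - score))

# Gambit Tiger 2: 2723, -182.  Gandalf 5: 2477, +182.
overlap = intervals_overlap(2723, 182, 2477, 182)

# 80.4 % over 23 games corresponds to 18.5 points.
diff = logistic_elo_diff(18.5 / 23)

print("intervals overlap:", overlap)
print("implied Elo difference:", round(diff, 1))
```

The implied difference lands close to the 246-point gap in the table, so the ratings themselves look consistent with the score; it is only the width of the error margins that blocks a conclusion here.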

However, I have a 'whoisbetter' program using trinomial distribution
probabilities that gives this result a confidence level of >99%. I.e. it
is statistically near certain that Gambit Tiger 2 is better than Gandalf 5.
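
For anyone who wants to reproduce a figure like this, here is a minimal sketch of one possible trinomial test. It is not necessarily what 'whoisbetter' does. The assumptions are: under the null hypothesis both programs are equally strong, so P(win) = P(loss) = (1 - d) / 2, with the draw rate d estimated from the match itself. The 16 wins, 5 draws, 2 losses split follows from the table (23 games, 5 draws, 18.5 points). The function name is hypothetical.

```python
from math import comb

def prob_margin_at_least(wins, draws, losses):
    """One-sided p-value: probability that an equally strong
    opponent would win by at least the observed margin.

    Assumes P(win) = P(loss) = (1 - d) / 2 under the null, with
    the draw rate d taken from the observed result.
    """
    n = wins + draws + losses
    d = draws / n
    p = (1.0 - d) / 2.0
    margin = wins - losses
    total = 0.0
    # Enumerate every (w, l, k) outcome of the trinomial distribution.
    for w in range(n + 1):
        for l in range(n - w + 1):
            if w - l >= margin:
                k = n - w - l  # draws in this outcome
                coeff = comb(n, w) * comb(n - w, l)
                total += coeff * p ** (w + l) * d ** k
    return total

p_value = prob_margin_at_least(16, 5, 2)
confidence = 1.0 - p_value
print(f"p = {p_value:.5f}, confidence = {confidence:.3%}")
```

Under these assumptions the confidence comes out above 99%, in line with the figure quoted above.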

It seems that ELOSTAT isn't as sensitive as it could be in rating list mode. If
I put it in single competition mode, it does give a significant result.
(Unfortunately it doesn't output it to a file and I can't copy/paste it either.)

--
GCP



Last modified: Thu, 15 Apr 21 08:11:13 -0700
