Computer Chess Club Archives

Subject: About Elostat: confidence intervals

Author: Marc-Olivier Moisan-Plante

Date: 20:27:14 10/17/05

Hi Uri,

Elostat uses the "non-parametric bootstrap" to calculate confidence intervals.
If my memory serves me well (my stats courses are already some way behind me),
this technique is "empirical" in the sense that the confidence intervals are
based on the statistical distribution generated by the sample under study.
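
To make the idea concrete, here is a minimal sketch of a percentile bootstrap on
per-game scores (1 for a win, 0.5 for a draw, 0 for a loss). It is not Elostat's
code: the function name, the number of resamples and the fixed seed are all
assumptions chosen for illustration. The sample is the 20-game match that
appears further down in this post.

```python
import random

def bootstrap_score_interval(scores, resamples=10000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean score per game.

    Hypothetical helper for illustration; not taken from Elostat.
    """
    rng = random.Random(seed)
    n = len(scores)
    means = []
    for _ in range(resamples):
        # Draw n games with replacement from the observed sample only.
        sample = rng.choices(scores, k=n)
        means.append(sum(sample) / n)
    means.sort()
    tail = (1.0 - level) / 2.0
    lo = means[int(tail * resamples)]
    hi = means[int((1.0 - tail) * resamples) - 1]
    return lo, hi

# +1, =18, -1 as a list of per-game scores
match1 = [1.0] * 1 + [0.5] * 18 + [0.0] * 1
lo, hi = bootstrap_score_interval(match1)
print(lo, hi)
```

The interval depends only on the games actually in the sample, which is why
changing the sample can move it in either direction.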

From the experience of testers we know that the performance of a program
depends on its opponent. Against "A" Fruit-Uri performs well, against "B"
Fruit-Uri performs badly, against "C" Fruit-Uri performs average but with a lot
of draws, and against "D" Fruit-Uri performs average but with very few draws.

When Elostat calculates the confidence intervals, it works from an (empirical)
mixture of the data generated by the four probability distributions against
A, B, C and D.

That means that when you have only a few games against a small number of
opponents, the error bars are likely to change as games against other opponents
are added!

I did the following "experiment" (in fact against "C" and "D" only).

Match 1: Fruit-Uri vs Strong +1, =18, -1 for 50%. Calculations by Elostat:

  Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Fruit-Uri                      : 2500   48  48    20    50.0 %   2500   90.0
  2 Strong                         : 2500   48  48    20    50.0 %   2500   90.0

Now I add match 2 to the data:
Fruit-Uri vs Strange +5, =0, -5 for 50%. Calculations by Elostat on the full
sample:

  Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Fruit-Uri                      : 2500   80  80    30    50.0 %   2500   60.0
  2 Strong                         : 2500   48  48    20    50.0 %   2500   90.0
  3 Strange                        : 2500  252 252    10    50.0 %   2500    0.0

So you see that the confidence intervals may grow by adding more games if those
games are against "do or die" opponents.
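
As a rough cross-check of the numbers in the tables above, here is a small
sketch that uses a plain normal approximation on the per-game scores instead of
a bootstrap. Whether Elostat computes its margins this way is an assumption, and
the helper names and the choice of z = 1.96 (a 95% interval) are mine. It does,
however, give margins that round to the same +/- values as the tables.

```python
import math

def elo_from_score(p):
    """Elo difference implied by an expected score p (logistic Elo formula)."""
    return 400.0 * math.log10(p / (1.0 - p))

def elo_margin(wins, draws, losses, z=1.96):
    """Half-width of a normal-approximation interval, expressed in Elo.

    Hypothetical helper: uses the population variance of the per-game scores.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half_width = z * math.sqrt(var / n)
    # Convert the upper end of the score interval to an Elo offset.
    return elo_from_score(mean + half_width) - elo_from_score(mean)

print(round(elo_margin(1, 18, 1)))  # Fruit-Uri after match 1: 48
print(round(elo_margin(6, 18, 6)))  # Fruit-Uri after both matches: 80
print(round(elo_margin(5, 0, 5)))   # Strange: 252
```

The 10 decisive games raise the per-game variance of Fruit-Uri's scores from
0.025 to 0.1, which outweighs the gain from going from 20 to 30 games.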

The confidence intervals after match 1 are correct (theoretically), but they are
based on the current sample. With Elostat, when you change the sample the
confidence intervals will change, but they will not necessarily shrink.
