Author: Marc-Olivier Moisan-Plante
Date: 20:27:14 10/17/05
Hi Uri,

Elostat uses the "non parametric bootstrap" to calculate confidence intervals. If my memory serves me well (my stats are already a bit far away), this technique is "empirical" in the sense that the confidence intervals are based on the statistical distribution generated by the sample under study.

From the experience of testers we know that the performance of one program depends on its opponent. Against "A" Fruit-Uri performs well, against "B" Fruit-Uri performs badly, against "C" Fruit-Uri performs average but with a lot of draws, and against "D" Fruit-Uri performs average but with very few draws.

When Elostat calculates the confidence intervals, it does an (empirical) mixture of the statistical data generated by the 4 probability distributions against A-B-C-D. That means that when you have only a few games against a small number of opponents, the error bars are likely to change!!

I did the following "experiment" (in fact against "C" and "D" only).

Match 1: Fruit-Uri vs Strong +1, =18, -1 for 50%.

Calculations by Elostat:

   Program       Elo    +     -   Games   Score   Av.Op.  Draws
 1 Fruit-Uri  : 2500   48    48     20   50.0 %    2500    90.0
 2 Strong     : 2500   48    48     20   50.0 %    2500    90.0

Now I add match 2 to the data: Fruit-Uri vs Strange +5, =0, -5 for 50%.

Calculations by Elostat on the full sample:

   Program       Elo    +     -   Games   Score   Av.Op.  Draws
 1 Fruit-Uri  : 2500   80    80     30   50.0 %    2500    60.0
 2 Strong     : 2500   48    48     20   50.0 %    2500    90.0
 3 Strange    : 2500  252   252     10   50.0 %    2500     0.0

So you see that the confidence intervals may grow when you add more games, if those games are against "do or die" opponents: Fruit-Uri's error bar went from 48 to 80 even though the sample grew from 20 to 30 games.

The confidence intervals after match 1 are correct (theoretically), but they are based on the current sample. With Elostat, when you change the sample the confidence intervals will change, but they will not necessarily diminish.
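For anyone who wants to play with the effect, here is a minimal Python sketch. It is not Elostat's code and I am not claiming it is Elostat's algorithm. The function names (elo_diff, elo_margin, bootstrap_margin) are made up for this illustration, and the choices of a 1.96 multiplier, a variance divided by n, and the logistic Elo conversion are my assumptions. The first function uses a normal approximation on the empirical per-game scores; the second does a simple seeded non parametric bootstrap of the mean score.

```python
import math
import random


def elo_diff(p):
    """Elo difference implied by an expected score p (logistic model, 0 < p < 1)."""
    return 400.0 * math.log10(p / (1.0 - p))


def elo_margin(scores, z=1.96):
    """Upper half-width of a ~95% interval in Elo points.

    Assumption: normal approximation using the empirical spread of the
    per-game scores (1, 0.5, 0), with the variance divided by n.
    """
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / n
    upper = mean + z * math.sqrt(var / n)
    return elo_diff(upper) - elo_diff(mean)


def bootstrap_margin(scores, resamples=2000, seed=0):
    """Same quantity from a percentile bootstrap of the mean score.

    Resamples the games with replacement and takes the 97.5th percentile
    of the resampled means. Breaks down if that percentile reaches 1.0.
    """
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(sum(rng.choices(scores, k=n)) / n for _ in range(resamples))
    upper = means[int(0.975 * resamples) - 1]
    return elo_diff(upper) - elo_diff(sum(scores) / n)


# Per-game scores from Fruit-Uri's point of view.
match1 = [1.0] + [0.5] * 18 + [0.0]   # vs Strong:  +1 =18 -1
match2 = [1.0] * 5 + [0.0] * 5        # vs Strange: +5 =0  -5
combined = match1 + match2

for name, games in (("match 1", match1), ("match 2", match2), ("combined", combined)):
    print(name, len(games), "games, +/-", round(elo_margin(games)))

print("bootstrap, match 1 :", round(bootstrap_margin(match1)))
print("bootstrap, combined:", round(bootstrap_margin(combined)))
```

Under these assumptions the normal-approximation margins round to 48, 252 and 80, which happen to match the figures posted above. The point is the same either way: the drawish games have a small per-game spread, the decisive games have the largest possible one, and mixing them in raises the empirical variance faster than the extra 10 games can shrink the error.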
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.