Author: Gian-Carlo Pascutto
Date: 05:33:09 01/13/02
On January 13, 2002 at 08:17:23, Gian-Carlo Pascutto wrote:

>On January 13, 2002 at 08:07:49, Gian-Carlo Pascutto wrote:
>
>>24 games isn't much, but it's still a huge score difference,
>>enough to be significant. I'll plug the result into elostat
>>and see what comes out.
>
>My mistake, I miscounted. There were only 5 draws and thus 23 games.
>
>    Program            Elo    +    -   Games   Score   Av.Op.  Draws
>
>  1 Gambit Tiger 2   : 2723  114  182    23    80.4 %   2477   21.7 %
>  2 Gandalf 5        : 2477  182  114    23    19.6 %   2723   21.7 %

If I read this result correctly, ELOSTAT indicates it's not possible to
draw any conclusion about who is stronger, since 2723-182 < 2477+182,
i.e. the error ranges of the two programs overlap.

However, I have a 'whoisbetter' program using trinomial distribution
probabilities that gives this result a confidence level of >99%.
I.e. it is statistically near certain that Gambit Tiger 2 is better
than Gandalf 5.

It seems that ELOSTAT isn't as sensitive as it could be in rating list
mode. If I put it in single competition mode, it does give a significant
result. (Unfortunately it doesn't output it to a file and I can't
copy/paste it either.)

--
GCP
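For anyone who wants to check the numbers, here is a rough Python sketch of the kind of calculation described above. It is not the actual 'whoisbetter' program, whose internals are not shown in this post. The function name `trinomial_tail` is made up for illustration, and fixing the draw probability at the observed draw rate is an assumption of this sketch. The 16 wins and 2 losses follow from the table: 80.4 % of 23 games is 18.5 points, and with 5 draws that leaves 16 wins and 2 losses.

```python
from math import comb

def trinomial_tail(wins, draws, losses):
    """One-sided probability of a win margin at least as large as the
    observed one, under the null hypothesis that both programs are
    equally strong (hypothetical helper, not the 'whoisbetter' code)."""
    n = wins + draws + losses
    p_draw = draws / n            # assumption: draw rate fixed at the observed value
    p_win = (1.0 - p_draw) / 2.0  # equal strength: a win and a loss are equally likely
    margin = wins - losses
    tail = 0.0
    # Enumerate every (w, l, d) outcome of n games and sum the trinomial
    # probabilities of those whose margin is at least the observed one.
    for w in range(n + 1):
        for l in range(n - w + 1):
            if w - l < margin:
                continue
            d = n - w - l
            ways = comb(n, w) * comb(n - w, l)
            tail += ways * p_win**w * p_win**l * p_draw**d
    return tail

# 23 games, 5 draws, 18.5 points for Gambit Tiger 2 -> 16 wins, 2 losses.
p = trinomial_tail(16, 5, 2)
print(f"one-sided p-value: {p:.5f}, confidence: {100 * (1 - p):.2f} %")

# The interval check from the post: lower bound of Gambit Tiger 2
# against upper bound of Gandalf 5.
tiger_low = 2723 - 182
gandalf_high = 2477 + 182
print(tiger_low, gandalf_high, tiger_low < gandalf_high)  # 2541 2659 True
```

Under these assumptions the tail probability comes out well below 1 %, which agrees with the >99% confidence quoted above, even though the two ELOSTAT error ranges overlap.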
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.