Author: Alberto Rezza
Date: 01:54:16 12/16/03
On December 15, 2003 at 01:25:40, Christophe Theron wrote:

>I think you are not going to like the answer. :)
>
>It depends on:
>* the reliability you want (do you want a 70% reliability? 80%? 90%? 95%?)
>* the elo difference between the programs
>
>If you want a very good reliability in the result (for example 95%) and the two
>programs are very close in elo, then you might need several thousands games.

There is a fatal flaw in your argument: if you already know the Elo difference, then there is no point at all in playing a test match...

You should have made it depend on:
* the reliability you want
* the actual result of the match (N points out of M games)

and from this you can draw conclusions like "Elo difference >50 points with 95% confidence" or (if you are a much better statistician than I am) even compute a continuous distribution of probability for the Elo difference.

In other words: 5 games may be too few, but there is a big difference between 16-14 and 30-0. If you get a 30-0 result, the hint is: maybe the programs are NOT very close in Elo, so you do NOT need thousands of games. :)

Alberto

BTW: if you open Whoisbetter, the default params are 95% and 0 points for the loser; and the minimum number of games is given as 5 (result: 5-0) for 95% confidence. What a coincidence :)
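
To make the "N points out of M games" idea concrete, here is a rough sketch in Python. It is not how Whoisbetter works (I do not know its method); it assumes there are no draws, so every game is a win or a loss, it treats the games as independent, and it uses the usual logistic Elo curve. The function names are made up for illustration, and "confidence" here just means one minus a one-sided binomial p-value.

```python
from math import comb


def expected_score(elo_diff):
    """Expected score of the stronger side under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def tail_probability(wins, games, p):
    """P(X >= wins) for X ~ Binomial(games, p)."""
    return sum(
        comb(games, k) * p**k * (1.0 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


def confidence_better_by(wins, games, elo_diff=0.0):
    """One minus the one-sided p-value of the hypothesis
    'the true Elo difference is at most elo_diff'.
    Assumption: no draws, independent games."""
    return 1.0 - tail_probability(wins, games, expected_score(elo_diff))


def elo_lower_bound(wins, games, confidence=0.95):
    """Largest Elo difference that the match result still rejects at the
    given confidence, found by bisection (confidence falls as elo_diff rises)."""
    lo, hi = -2000.0, 2000.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if confidence_better_by(wins, games, mid) >= confidence:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    for wins, games in [(5, 5), (16, 30), (30, 30)]:
        print(
            f"{wins}-{games - wins}: "
            f"confidence winner is better = {confidence_better_by(wins, games):.4f}, "
            f"95% lower bound on Elo difference = {elo_lower_bound(wins, games):.0f}"
        )
```

Under these assumptions a 5-0 result gives 1 - (1/2)^5 = 0.96875, just over 95%, while 4-0 gives only 0.9375, which is consistent with 5 being the minimum number of games. A 16-14 result does not even reach 95% confidence that the winner is better at all, whereas 30-0 puts the lower bound on the Elo difference in the hundreds of points.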
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.