Computer Chess Club Archives


Subject: Re: Statistics and Test results

Author: Rick Bischoff

Date: 10:53:23 10/07/04


On October 07, 2004 at 11:38:41, Chris Welty wrote:

>>However, since the sample isn't random, the entire test is meaningless.
>
>What makes you think the sample isn't random?

It is not random, because you did not say it was.  Playing 30 games in a row does
not count as a random sample.

>>If you could test engines like that, you would use the binomial
>>distribution and would need more than 30 random games from those engines to
>>properly test the probability of one engine winning over another.
>
>That's just wrong. The number of games you need is dependent on the results of
>the games.

No, that is just wrong.  To properly test a hypothesis, you do not say "Ok,
looks good to me."  You set your criteria beforehand: do you want to be
95% confident that you have the right answer? 99% confident?  Then you decide
on the proper test to use and the sample size.  You do not simply quit formal
testing because the results look good.
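
To make "set your criteria beforehand" concrete, here is a rough sketch of how the number of games could be fixed in advance with an exact one-sided binomial test. It rests on assumptions that are not in this thread: draws are ignored, games are independent, the null hypothesis is a 50% win rate, and the values for `alpha`, `power` and the alternative win rate `p1` are illustrative only. The function names are made up for this example.

```python
from math import comb


def binom_pmf(n, k, p):
    """P(X == k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(n, i, p) for i in range(k, n + 1))


def critical_wins(n, alpha=0.05, p0=0.5):
    """Smallest win count k such that P(X >= k | p0) <= alpha.

    Winning at least k of n games rejects "true win rate is p0"
    at significance level alpha (one-sided).
    """
    tail = 0.0
    for k in range(n, -1, -1):
        tail += binom_pmf(n, k, p0)
        if tail > alpha:
            return k + 1
    return 0


def games_needed(p1, alpha=0.05, power=0.8, p0=0.5, max_n=2000):
    """Smallest n at which the test detects a true win rate p1
    with probability >= power. Returns (n, critical_wins) or None."""
    for n in range(1, max_n + 1):
        k = critical_wins(n, alpha, p0)
        if binom_tail(n, k, p1) >= power:
            return n, k
    return None


if __name__ == "__main__":
    # Assumed scenario: the stronger engine truly wins 60% of decisive games.
    k30 = critical_wins(30)
    print("wins needed out of 30 at alpha=0.05:", k30)
    print("power of a 30-game test if p1=0.6:", round(binom_tail(30, k30, 0.6), 3))
    print("(games, critical wins) for 80% power:", games_needed(0.6))
```

Under these assumptions, alpha and power are chosen first and the sample size follows from them. The sketch also shows that a 30-game match has little chance of detecting a modest difference in strength.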



Last modified: Thu, 15 Apr 21 08:11:13 -0700
