Computer Chess Club Archives


Subject: Re: Testing a newer version a program against a previous is misleading

Author: Russell Reagan

Date: 16:39:59 10/16/03


On October 16, 2003 at 17:23:21, George Tsavdaris wrote:

> There is no such thing as statistical reliability. For that we would have to
> play an infinite number of games. Of course we can define a threshold, like
> 0.95 or 0.99, that the statistical reliability of a result must reach before
> we are satisfied and say that engine A is stronger than B.
> So no tournament is worthless, only less reliable than another.

That's what I meant. Tournaments of a few dozen rounds are rarely
statistically reliable at the 95% level or higher.
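
To put a rough number on that, here is a minimal sketch in Python. It assumes the games are independent and uses a normal approximation for the match score, which is only one common way to do this and not necessarily what any particular rating tool does. The function names and the win/draw/loss counts are made up for illustration.

```python
import math


def score_confidence_interval(wins, draws, losses, z=1.96):
    """Return (mean, low, high) for the per-game score of engine A.

    Uses a normal approximation; z=1.96 corresponds to roughly 95%.
    Assumes independent games, which real matches only approximate.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("need at least one game")
    mean = (wins + 0.5 * draws) / n
    # Variance of a single game's score (1, 0.5 or 0 points).
    var = (
        wins * (1.0 - mean) ** 2
        + draws * (0.5 - mean) ** 2
        + losses * (0.0 - mean) ** 2
    ) / n
    half_width = z * math.sqrt(var / n)
    return mean, mean - half_width, mean + half_width


def is_significant(wins, draws, losses, z=1.96):
    """True if the interval excludes 0.5, i.e. an even score is ruled out."""
    _, low, high = score_confidence_interval(wins, draws, losses, z)
    return low > 0.5 or high < 0.5


if __name__ == "__main__":
    # Hypothetical results: the same 57.5% score over 40 and over 400 games.
    for w, d, l in [(18, 10, 12), (180, 100, 120)]:
        mean, low, high = score_confidence_interval(w, d, l)
        print(f"{w + d + l} games: score {mean:.3f}, "
              f"95% interval [{low:.3f}, {high:.3f}], "
              f"significant: {is_significant(w, d, l)}")
```

With these made-up numbers, the 40-game interval still includes 0.5, so the result does not separate the engines at the 95% level, while the same score over 400 games does.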



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.