Computer Chess Club Archives


Subject: Re: Number of games to test [100 - 300 games are needed]

Author: Kurt Utzinger

Date: 01:04:52 03/09/02



On March 09, 2002 at 01:01:02, TEERAPONG TOVIRAT wrote:

>Hi,
>
>How many games do I have to play to test between 2 versions? I think it
>depends on the score ratio. Suppose program A beats program B 4-1. Can I say
>A is superior to B? Or is the number too small?
>
>Thanks for any comment,
>Teerapong

As the example below shows, even a 25-15 result over 40 games does not mean
much: the 95 % error margins of roughly 90-100 Elo are about as large as the
88-point rating difference itself. To say something concrete you need at least
100 games, and preferably 200-300.

<pre>Individual statistics:

(1) A                         :  40 (+ 18,= 14,-  8), 62.5 %
    B                         :  40 (+ 18,= 14,-  8), 62.5 %

(2) B                         :  40 (+  8,= 14,- 18), 37.5 %
    A                         :  40 (+  8,= 14,- 18), 37.5 %</pre>


<pre>
    Program                            Score     %    Av.Op.  Elo    +   -   Draws

  1 A                              :  25.0/ 40  62.5   2356   2444   98  89   35.0 %
  2 B                              :  15.0/ 40  37.5   2444   2356   89  98   35.0 %</pre>

Generated by Elostat v1.1 from Frank Schubert:
[Start ELO = 2400]


This file contains the most important and central result of the calculations,
the rating list, arranged by Elo performances:

The separate columns give the (mean) Elo performance, the + and - margins of
error given with 95 % confidence, the number of finished games, the relative
score given in percentages, the average opponent Elo and finally, the relative
number of draws for each program.

[From the readme.rtf of Frank Schubert's Elostat v1.1]
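
To get a feeling for where such margins come from, here is a rough Python sketch. It is not Elostat's actual algorithm (which I have not reproduced here); it assumes the usual logistic Elo formula and a plain normal approximation for the mean score, and the function names `elo_diff` and `match_interval` are made up for illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))


def match_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for the Elo difference of a match.

    Assumption: normal approximation for the mean score, z = 1.96 for
    roughly 95 % confidence. Fails if the interval reaches 0 % or 100 %.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half = z * math.sqrt(var / n)  # half-width of the score interval
    return elo_diff(score), elo_diff(score - half), elo_diff(score + half)


# The 40-game match from the post, then the same win/draw/loss
# proportions over 3 and 8 times as many games.
for k in (1, 3, 8):
    w, d, l = 18 * k, 14 * k, 8 * k
    elo, low, high = match_interval(w, d, l)
    print(f"{w + d + l:4d} games: {elo:+6.1f} Elo, interval {low:+6.1f} .. {high:+6.1f}")
```

For the +18 =14 -8 result this gives roughly +89 Elo with an interval from about +4 to +185, broadly in line with the +98/-89 margins above. The interval narrows only with the square root of the number of games, which is why a few hundred games are needed before a difference of this size becomes clear.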

Kind regards
Kurt




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.