Author: Rémi Coulom

Date: 12:03:29 12/28/02


On December 27, 2002 at 09:41:17, Peter Fendrich wrote:

>
>What to do
>----------
>I have a few suggestions that I would like to discuss:
>
>1) Better utilisation of computer time. If I have time for 20 games it's better
>to select 10 players and let A and B meet them respectively.
>The meaning of better will be better.

I use the statistical test to measure whether a change to my chess program is an improvement, so that I can decide whether to keep it.

Self-play is certainly not an accurate way to evaluate the difference in playing strength between two close versions of the same program. In particular, it tends to exaggerate the effect of small differences. But that is exactly why it is useful: it acts as a magnifying glass for observing the effect of a small change in the program.

I believe that, for a given number of games, self-play is more likely to give statistically significant results than playing against a pool of opponents, because of this amplification effect (this belief might be worth testing, by the way). Of course, if you obtain statistically significant results against 10 different players, then the result is certainly much more valuable.

Also, note that with 20 games against 10 opponents, A plays 10 games and B plays 10 games, whereas 20 games of self-play give 20 games for each player, which, I suppose, would make it easier to reach statistical significance.

>
>2) Use some degree of better, for instance 60% (instead of 50%) as the lower
>limit. "A beats B with at least 60%" with a probability of x%. It's hard to tell
>anything about probability against the rest of the population but maybe some a
>priori distribution can be used.
>
>In both cases draws have to be counted because they are part of the question.
>
>Peter

Yes, of course, that is a possibility. Unfortunately, the changes I usually make to my chess program are so small that proving a >50% probability of winning is the best I can hope for, most of the time!

Rémi
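
To make the 50% versus 60% question concrete, here is a minimal sketch of one possible test: an exact one-sided binomial test on decisive games. It is not necessarily the test discussed in this thread. The function name `p_value_at_least` and its parameters are made up for illustration, and it assumes that draws are simply discarded, which is a simplification that Peter's remark about counting draws argues against.

```python
from math import comb


def p_value_at_least(wins: int, losses: int, threshold: float = 0.5) -> float:
    """One-sided exact binomial p-value (hypothetical helper).

    Returns the probability of seeing `wins` or more wins in
    wins + losses decisive games if the true win probability were
    exactly `threshold`. A small value is evidence that the true win
    probability is above `threshold`.

    Assumption: draws are dropped before calling this function.
    """
    n = wins + losses
    return sum(
        comb(n, k) * threshold**k * (1.0 - threshold) ** (n - k)
        for k in range(wins, n + 1)
    )


if __name__ == "__main__":
    # Example: 15 wins and 5 losses, tested against both lower limits.
    print(p_value_at_least(15, 5, threshold=0.5))
    print(p_value_at_least(15, 5, threshold=0.6))
```

With 15 wins and 5 losses the p-value against the 50% limit is about 0.021, while against the 60% limit it is much larger. This illustrates why a 60% lower limit is hard to establish when the change being tested is small.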

- Re: Proving something is better
**Peter Fendrich** *07:27:37 01/02/03*


Last modified: Thu, 07 Jul 11 08:48:38 -0700
