Computer Chess Club Archives


Subject: Re: nullmove and tactics

Author: Pat King

Date: 14:41:07 03/26/04


On March 24, 2004 at 17:31:35, Dann Corbit wrote:

>On March 24, 2004 at 16:53:08, Uri Blass wrote:
>[snip]
>>The difference is more important and 10-0 is clearly more telling than 19-11
>
>It is stronger, but less reliable.  If it is true, then a dominant change may
>have been found.  But the odds that it is true are not nearly so strong as the
>odds of 19-11 being true.

Gotta go with Uri on this one. I calculated the following table assuming a 95%
confidence level. N is the number of games, W the number of wins needed to
conclude that one engine is better than the other. W is of course always rounded up.

N     W
5     5  (actually 4.3, so 4 out of 5 ain't bad!)
10    8
20    14
30    20
50    31
100   59

So I'm more confident that Uri's 10-0 result is significant than I am about your
19-11 result.
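
For anyone who wants to reproduce the table: the numbers are consistent with a one-sided normal approximation, W = N/2 + z*sqrt(N)/2 with z of about 1.645, counting only wins and losses. That formula is an inference from the table values (it gives 4.3 for N = 5), not something stated in the post, so treat the sketch below as one plausible reconstruction. The function name wins_needed is made up for illustration.

```python
import math
from statistics import NormalDist


def wins_needed(n_games, confidence=0.95):
    """Return (raw threshold, threshold rounded up) for n_games decisive games.

    Assumptions (not stated in the post): no draws, independent games, and
    a one-sided normal approximation to the binomial with p = 0.5.
    """
    z = NormalDist().inv_cdf(confidence)  # about 1.645 for 95%
    threshold = n_games / 2 + z * math.sqrt(n_games) / 2
    return threshold, math.ceil(threshold)


print("N     W")
for n in (5, 10, 20, 30, 50, 100):
    raw, w = wins_needed(n)
    print(f"{n:<5d} {w:<3d} (raw {raw:.1f})")
```

Passing a different confidence value (for example 0.90 or 0.99) shows how the required win count moves with the chosen level.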

An interesting question is when do you give up? At 19-11 you're very close to
meeting the 95% threshold (20 wins out of 30). Suppose you play 20 more games and
find yourself at 30-20, then play 50 more and end up at 58-42. Each time you're
one win short of the number in the table.

Another way to look at this is how confident do you need to be? After 100 games,
my example above surely meets the 90% threshold. Question for the group: if you
were to use a formal statistical method to evaluate your changes, what
confidence level would you want to use? 90%? 95%? 99%?
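
One way to explore that question is to turn it around: for a given score, compute the chance that two equal engines would produce a result at least that lopsided, and see which confidence levels it clears. The sketch below uses the exact binomial tail rather than the normal approximation, and assumes no draws and independent games. The function name one_sided_p_value is hypothetical.

```python
from math import comb


def one_sided_p_value(wins, games):
    """P(at least `wins` wins in `games` games) if both engines are equal.

    Assumptions: every game is decisive (no draws), games are independent,
    and each engine wins with probability 0.5 under the null hypothesis.
    """
    tail = sum(comb(games, k) for k in range(wins, games + 1))
    return tail / 2 ** games


for wins, losses in ((10, 0), (19, 11), (30, 20), (58, 42)):
    games = wins + losses
    p = one_sided_p_value(wins, games)
    levels = [f"{c:.0%}" for c in (0.90, 0.95, 0.99) if p <= 1 - c]
    print(f"{wins}-{losses}: p = {p:.4f}, clears: {', '.join(levels) or 'none'}")
```

Because the exact tail and the normal approximation can differ slightly, a score sitting right at a threshold may land on different sides of it depending on which method is used.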

I may throw together a quick web page about this, if there's any interest.

Pat King



Last modified: Thu, 15 Apr 21 08:11:13 -0700
