Computer Chess Club Archives



Subject: Re: Being better...

Author: Kolss

Date: 13:14:57 01/23/04



On January 23, 2004 at 14:43:10, Bruce Cleaver wrote:

>The real weakness here is accepting the 95% certainty claim.  It's traditional,
>and makes some kind of sense, but when you get down to it there is also an
>element of arbitrariness.  If you are satisfied with a 90% certainty, or perhaps
>+/- 1.5 sigma, or whatever, the scores that will impress you (out of 300 games)
>change.
>
>There are also times when 95% isn't good enough, and the claim has to be much
>tighter.

I completely agree.

95% is arbitrary; take 68% or 82.73% if that makes you happy. What matters is
that you know which confidence level you are using and what it tells you. Keep
in mind that a statement like "A is better than B" can only ever be made with a
certain confidence (or, equivalently, a certain error probability) - it is
entirely up to you what level is good enough for you (there is nothing magical
about 95%).
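
To make the link between a confidence level and a "+/- so many sigma" band concrete, here is a small sketch. It assumes match scores are close enough to normally distributed for the usual two-sided interval to apply; the function names are made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sigma_for_confidence(confidence: float) -> float:
    """Half-width z (in sigmas) of the two-sided band with P(|Z| <= z) = confidence."""
    return STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2.0)


def confidence_for_sigma(z: float) -> float:
    """Two-sided confidence level covered by a +/- z sigma band."""
    return 2.0 * STANDARD_NORMAL.cdf(z) - 1.0


# The levels mentioned in this thread, plus 99% for comparison.
for level in (0.68, 0.8273, 0.90, 0.95, 0.99):
    print(f"{level:.2%} confidence -> +/- {sigma_for_confidence(level):.2f} sigma")

print(f"+/- 1.5 sigma -> {confidence_for_sigma(1.5):.1%} confidence")
```

A lower confidence level means a narrower band, so a smaller winning margin is enough to fall outside it.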

Just to repeat the previous case (S8 vs. S7.04, 162.5-137.5, +90 -65 =145): the
probability that S8 is the better of the two programs is about 97.5% (in fact
slightly higher, maybe 98%). That figure is based on this single match, so it is
only valid for a direct match between the two in this particular setup etc. If
you say you want 99% certainty before you believe it, that is fine - you may
have to run a 1000-game match then after all. If you "believe" in the 95% and
want to show that one program is better than the other, then you can stop. If
you want 100% certainty, you will have to play an infinite number of games.
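
For anyone who wants to check the figure, one way to arrive at it is sketched below. This is only a normal-approximation estimate and not necessarily the exact calculation behind the number above: it treats each game as an independent draw worth 1, 0.5 or 0 points, estimates the per-game variance from the observed results, and reads the one-sided normal probability as the "likelihood of superiority". The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def superiority_estimate(wins: int, losses: int, draws: int):
    """Return (score fraction, z-score vs. a 50% score, one-sided probability).

    Assumes independent games and a normal approximation to the mean score.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games            # mean points per game
    mean_square = (wins + 0.25 * draws) / games     # mean of squared points
    variance = mean_square - score ** 2             # per-game variance
    std_error = sqrt(variance / games)              # standard error of the mean
    z = (score - 0.5) / std_error                   # distance from an even match
    return score, z, NormalDist().cdf(z)


score, z, probability = superiority_estimate(wins=90, losses=65, draws=145)
print(f"score {score:.1%}, z = {z:.2f}, estimated probability {probability:.1%}")
```

For +90 -65 =145 this comes out at roughly 2 sigma and about 97.8%, in line with the "slightly higher than 97.5%" above. Note that the draws matter: they lower the per-game variance, so the same 162.5-137.5 score with fewer draws would be less convincing.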

Best regards - Munjong.




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.