Computer Chess Club Archives


Subject: Re: Being better...

Author: Rolf Tueschen

Date: 05:38:40 01/24/04


On January 24, 2004 at 02:24:49, Kolss wrote:

>On January 23, 2004 at 18:12:19, Rolf Tueschen wrote:
>
>>On January 23, 2004 at 16:14:57, Kolss wrote:
>>
>>>On January 23, 2004 at 14:43:10, Bruce Cleaver wrote:
>>>
>>>>The real weakness here is accepting the 95% certainty claim.  It's traditional,
>>>>and makes some kind of sense, but when you get down to it there is also an
>>>>element of arbitrariness.  If you are satisfied with a 90% certainty, or perhaps
>>>>+/- 1.5 sigma, or whatever, the scores that will impress you (out of 300 games)
>>>>change.
>>>>
>>>>There are also times when 95% isn't good enough, and the claim has to be much
>>>>tighter.
>>>
>>>I completely agree.
>>>
>>>95% is arbitrary; take 68% or 82.73% if that makes you happy. Only you should
>>>know what confidence interval you employ and what it tells you. Keep in mind
>>>that you can only ever say something like "A is better than B" with a certain
>>>(error) probability / confidence - it is entirely up to you what probability is
>>>good enough for you (nothing magical about the 95%).
>>>
>>>Just to repeat the previous case (S8 vs. S7.04 162.5-137.5, +90 -65 =145): the
>>>probability that S8 is the better of the two programs, based on this single
>>>match and therefore only valid for a direct match between the two in this
>>>particular setup etc., is about 97.5% (it will in fact even be slightly higher,
>>>maybe 98%). If you say you want 99% certainty before you believe it, that is
>>>fine - you may have to run a 1000-game match then after all. If you "believe" in
>>>the 95% and want to show that one program is better than the other, then you can
>>>stop. If you want 100% certainty, you will have to play an infinite number of
>>>games.
>>>
>>>Best regards - Munjong.
>>
>>
>>Munjong,
>>you can repeat it as much as you want but it remains wrong!
>>With your data you can NOT say that Shredder8 is now better with 97.5% or
>>whatever "probability". That is nonsense. Period.
>>
>>Rolf
>
>Rolf,
>
>At least my assertions are based on some theory - namely statistical theory.
>I suppose you would even deny that one can say that Shredder8 is better with at
>least 50% probability...
>
>Munjong.


Sure. But you mean +50% or -50%. This is decisive to me...

Rolf
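
For readers who want to check where the "about 97.5%" figure for the +90 -65 =145 result comes from, here is a minimal sketch under stated assumptions. It treats each game as an independent draw scoring 1, 0.5 or 0, uses a normal approximation for the mean score, and reports the one-sided confidence that the true score is above 50%. The function name likelihood_of_superiority is made up for this illustration, and whether this number may be read as "the probability that S8 is better" is exactly what is disputed in the thread.

```python
import math


def likelihood_of_superiority(wins, losses, draws):
    """One-sided confidence (normal approximation) that the true
    expected score is above 50%, given a match result.

    Assumption: games are independent and identically distributed,
    scoring 1 for a win, 0.5 for a draw, 0 for a loss.
    """
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n            # average score per game
    second_moment = (wins + 0.25 * draws) / n  # average of score squared
    variance = second_moment - mean ** 2       # per-game score variance
    if variance <= 0.0:
        raise ValueError("no variation in results (e.g. all draws)")
    std_error = math.sqrt(variance / n)        # standard error of the mean
    z = (mean - 0.5) / std_error               # distance from 50% in sigmas
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # The S8 vs. S7.04 match quoted above: 162.5-137.5 (+90 -65 =145).
    p = likelihood_of_superiority(90, 65, 145)
    print(f"z-based confidence: {p:.3f}")
```

With the match data from the thread, the result lands between the 97.5% and 98% mentioned in the post. The high draw rate matters here: draws shrink the per-game variance, so the same 162.5-137.5 score with fewer draws would give a lower figure. Comparing the returned value against 0.90, 0.95 or 0.99 shows how the choice of threshold changes which results count as convincing.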



Last modified: Thu, 15 Apr 21 08:11:13 -0700
