Computer Chess Club Archives



Subject: Re: Testmethods for n=0, n=1 and n=>800 - For Beginners and 'old Hands'

Author: Joachim Rang

Date: 07:39:48 09/13/02



On September 13, 2002 at 09:20:26, Rolf Tueschen wrote:


>Take a 100 m final in athletics. Now, if someone is visibly faster, then he's
>the best. The moment you can't decide with your own eyes who's the winner, there
>is no winner at all, no matter how many digits you measure. As humans we
>don't take the one runner with two nanoseconds less as the "best"! We simply
>say that they are equally strong. And that should be remembered in CC too. If
>you get a result of 52-48, then the two progs are equally strong. And no voodoo
>with statistics could bring more clarity. And 720 to 680 is - in chess with
>computers - also almost equally strong. You can't automatically get "better"
>results in CC by simply raising the n. Why? Because everything in statistics
>hinges on the underlying distribution. Strength should be a normal
>distribution, but it isn't in CC. In CC almost all depends on hardware. The rest
>is so minimal that you can't detect it statistically.
>(Another important aspect is the law of holding all variables constant except
>the one you want to measure. But I don't want to confuse too much.)
>
>Rolf Tueschen

I disagree:

If you get a result of 52-48, you can't say which engine is better, but if you
get a result of 5200-4800, you can say with at least 99% probability that
program A performs better against program B (which doesn't mean that program A
performs better than B against other programs).
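
To make the arithmetic behind this concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions, not a definitive test: it assumes the games are independent, uses the normal approximation to the binomial, and takes the worst-case per-game variance of 0.25 (every game decisive). The function name is made up for this example. Since draws only lower the variance, the result should be on the conservative side.

```python
import math


def one_sided_p_value(points_a: float, points_b: float) -> float:
    """Probability of a lead at least this large for A if both programs
    were in fact equally strong (null hypothesis: expected score 50%).

    Assumptions (not from the original post): independent games, normal
    approximation, per-game variance of 0.25 (no draws, the worst case).
    """
    n = points_a + points_b
    # Standard deviation of A's total score under the null hypothesis.
    sigma = math.sqrt(n * 0.25)
    # How many standard deviations A's score lies above an even result.
    z = (points_a - n / 2) / sigma
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    for a, b in [(52, 48), (720, 680), (5200, 4800)]:
        p = one_sided_p_value(a, b)
        print(f"{a}-{b}: p = {p:.6f}, confidence = {1 - p:.4%}")
```

Under these assumptions, 52-48 is only 0.4 standard deviations from an even score (p of about 0.34), so it says almost nothing, while 5200-4800 is 4 standard deviations away (p of about 0.00003), which is well beyond the 99% mark. The same 52% score becomes meaningful only because n is 100 times larger.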




Last modified: Thu, 15 Apr 21 08:11:13 -0700
