Computer Chess Club Archives

Subject: Being better...

Author: Rolf Tueschen

Date: 07:12:00 01/23/04

We just had a little dispute about an old topic. When can we say that a prog is
better than another? How can we proceed to make sound arguments?

Let me tell the story in fast mode.

There was a test, with 300 games or so as I understand it. That is an incredibly
high number of games, because often we have matches with only 20 or 40 games.

I understood further that, with a confidence interval of 1-58, we have 95%
confidence.

Now what I want to tell you, and this is an indisputable statistical standard:

If you get a value that lies within the interval, we cannot conclude that the
difference between the two progs is relevant or valid, or call it what you want.
It makes no sense to argue with such "low" differences. They could still be due
to chance. And the distribution of chance is the bell curve. Nothing else.
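
To make the interval idea concrete, here is a minimal sketch of how such a 95% interval could be computed for a match. It assumes the usual logistic Elo formula and a normal (bell curve) approximation; the win/draw/loss numbers are invented for illustration and are not the results of the test discussed above, and the function names are hypothetical.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, estimate, high) for the Elo difference.

    Uses a normal approximation; z=1.96 corresponds to about 95%.
    Assumes score +/- z*se stays strictly between 0 and 1.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points per game).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    low = elo_from_score(score - z * se)
    high = elo_from_score(score + z * se)
    return low, elo_from_score(score), high

# Hypothetical 300-game match, not the test from the discussion.
low, est, high = elo_interval(wins=110, draws=100, losses=90)
print(f"estimate {est:+.1f} Elo, interval {low:+.1f} .. {high:+.1f}")
```

With these invented numbers the interval straddles zero, which is exactly the case where no conclusion about the better prog can be drawn.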

We had the debate with the SSDF list often enough.

Two progs stand at the top. One is number one in the ranking. But is it really
stronger than prog number two?

The answer is easy. If the normal variation, the famous +- value in the SSDF
list, is say +-40 points and the difference between the progs is 35 points, THEN
we are unable to conclude anything for sure. It could be that 1 is stronger than
2, but the contrary could also be true. Only from values >40 on do we have
"certainty", statistically, that a prog in that specific design is proven
stronger than another one.
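
The rule just described can be written down in a few lines. This is only a sketch of the comparison as stated in this post (one +- margin compared against the rating gap); the function name and return strings are made up for illustration, and it is not how the SSDF itself computes anything.

```python
def verdict(rating_diff, margin):
    """Apply the rule from the post: a gap counts only if it exceeds the +- margin."""
    if abs(rating_diff) > margin:
        return "difference exceeds the margin"
    return "inconclusive"

# The example from the text: 35 points apart with a margin of +-40.
print(verdict(35, 40))  # prints "inconclusive"
print(verdict(45, 40))  # prints "difference exceeds the margin"
```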

This is all so simple and trivial that it is satisfying to be able to clarify it.

Have fun,

Rolf

P.S.

I just want to correct a serious mistake in a former posting. There it was said
that an Elo difference of, say, 1 point would speak for the better strength of
one prog over another, and that you needed so and so many games to prove that...
This is total nonsense. There is _no_ way to conclude anything out of an Elo
difference of 1 point, no matter if you have 300 or 100000 games. The difference
of 1 Elo point is meaningless. It's nonsense to even think about the millions of
games supposedly necessary to "prove" that. Statistics also has something to do
with normal human sense. We would always take such a difference for _equal_
strength.

Re: Being better... Dann Corbit 11:46:38 01/23/04
Re: Being better... Bruce Cleaver 11:43:10 01/23/04
- Re: Being better... Kolss 13:14:57 01/23/04
  - Re: Being better... Rolf Tueschen 15:12:19 01/23/04
    - Re: Being better... Kolss 23:24:49 01/23/04
      - Re: Being better... Rolf Tueschen 05:38:40 01/24/04
... and better Igor Gorelikov 08:57:38 01/23/04
- Re: ... and better Rolf Tueschen 09:07:35 01/23/04
Re: Being better... Bob Durrett 08:26:42 01/23/04
- Re: Being better... margolies,marc 12:55:38 01/23/04
  - Re: Being better... Bob Durrett 17:29:33 01/23/04
    - Re: Being better... margolies,marc 20:26:27 01/23/04
      - Re: Being better... Bob Durrett 22:53:47 01/23/04
        - Re: Being better... margolies,marc 12:59:16 01/24/04
- Re: Being better... Rolf Tueschen 09:00:01 01/23/04
  - Re: Being better... Bob Durrett 09:29:00 01/23/04
    - Re: Being better... Jonas Bylund 11:17:02 01/23/04
      - Re: Being better... Bob Durrett 11:21:16 01/23/04
        - Re: Being better... Rolf Tueschen 15:15:00 01/23/04
        - Re: Being better... Bob Durrett 17:40:58 01/23/04
        - Re: Being better... Rolf Tueschen 18:05:59 01/23/04

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.