Computer Chess Club Archives


Subject: Re: About head or tail (was Upon scientific truth - the nature of informati

Author: Christophe Theron

Date: 11:58:32 07/16/00


On July 16, 2000 at 07:59:44, Alberto Rezza wrote:

>On July 16, 2000 at 03:34:45, Ed Schröder wrote:
>
>>>posted by Dann Corbit on July 15, 2000 at 20:21:54:
>>
>>>Simplifying.  I have a penny.
>>>I toss it twice.
>>>Heads, heads.
>>>I toss it twice
>>>Heads, heads.
>>>I toss it twice
>>>Tails, heads.
>>>I toss it twice
>>>Heads, tails.
>>
>>>I count them up.
>>
>>>Heads are stronger than tails.
>>
>>>My conclusion is faulty.  Why?  Because I did not gather enough data.
>>
>>Right.
>
>Wrong. Perhaps it was the wrong example? Such a weakly defined "conclusion" is
>obviously correct. It's not even necessary to dig out your old statistics book.
>Try testing for P(heads) >= 0.5 + X with confidence 0.5 + Y. Without any
>calculation we can say that for X and/or Y small enough "Heads are stronger than
>tails" is justified.
>
>>But what about the crazy result of match-2? Apparently even 300 games are
>>still not enough to prove that the 10% faster version is superior (of
>>course it is), but the match score indicates both versions are equal,
>>which is not true.
>>
>>So how many games are needed to prove that version X is better than Y?
>
>Yes. So the problem is: how much confidence do we need in the chess programs'
>strength? Are 300 games not enough?



It depends on the Elo difference between the programs you are testing. If the
Elo difference between prog A and prog B is under 20 points, for example, then a
300-game match between A and B is not enough (and that is only at 80%
confidence).
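
A rough sketch of the arithmetic behind this kind of estimate, not necessarily
the calculation used for the figures above. It assumes independent games, the
standard logistic Elo curve, a two-sided normal approximation, and a
hypothetical draw_ratio parameter; the function names are illustrative only.

```python
import math
from statistics import NormalDist


def expected_score(elo_diff):
    """Expected score per game for the stronger side (logistic Elo curve)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff, confidence=0.80, draw_ratio=0.0):
    """Games needed before the expected margin over 50% equals the
    half-width of a two-sided confidence interval (normal approximation).

    Assumptions (not from the original post): independent games and a
    fixed draw ratio. Per-game results are 0, 0.5 or 1, so the per-game
    variance is score * (1 - score) - draw_ratio / 4.
    """
    score = expected_score(elo_diff)
    variance = score * (1.0 - score) - draw_ratio / 4.0
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(z * z * variance / (score - 0.5) ** 2)


if __name__ == "__main__":
    for diff in (10, 20, 50, 100):
        print(diff, games_needed(diff, confidence=0.80, draw_ratio=0.3))
```

Under these assumptions a 20-point difference needs more than 300 games at 80%
confidence, and the required number grows roughly with the inverse square of
the Elo difference.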




>It seems to me that most people here have set very strict standards for
>computers. If we were to apply such standards to human players, we would have to
>conclude that when a player gets a GM title from FIDE we really cannot say
>whether he is of GM strength; or we might say that we don't have enough games by
>Morphy to tell whether he was stronger than the average club player of his
>time...



Most people did not set such strict standards. The statistics did.

Most people are satisfied with a 10-game match to say that this personality of
ChessMaster is better than that one.

These guys are just doing it for fun, so for this purpose a 10-game match is
enough. There is absolutely no relevance in these matches, but from time to time
they make up 80% of the discussions on CCC. So my conclusion is that people are
more interested in fun and discussions than in relevant information.
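
To put a number on how little a 10-game match says, here is a minimal sketch
of the exact binomial calculation. It assumes no draws, independent games, and
a per-game win probability taken from the logistic Elo curve; the function
names are made up for illustration.

```python
from math import comb


def expected_score(elo_diff):
    """Expected score per game for the stronger side (logistic Elo curve)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def prob_stronger_wins_match(elo_diff, games=10):
    """Probability that the stronger side wins more than half of the games.

    Assumption (not from the original post): every game is decisive, so
    the number of wins follows a binomial distribution.
    """
    p = expected_score(elo_diff)
    need = games // 2 + 1  # smallest number of wins that is a majority
    return sum(
        comb(games, k) * p ** k * (1.0 - p) ** (games - k)
        for k in range(need, games + 1)
    )


if __name__ == "__main__":
    for diff in (0, 20, 50, 100):
        print(diff, round(prob_stronger_wins_match(diff, games=10), 3))
```

With these assumptions, a program that is 20 Elo points stronger wins a 10-game
match outright less than half of the time; the rest are drawn or lost matches.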

And you want my opinion? It's exactly the same with human chess players. And
the same with soccer (European football).




>A program whose results are good for 3 GM norms has GM strength - and that
>should be all.



Yes, and let's decide which program is the computer world champion by playing a
10-round tournament. It's only for fun, after all. And this way we will have a
different world champion every time we play the tournament. Everybody will
be happy, eventually. You don't need to have a super-strong program. You just
need to participate as often as possible.




    Christophe




Last modified: Thu, 15 Apr 21 08:11:13 -0700
