Computer Chess Club Archives



Subject: Re: 6 game 40/2 COMP WINS just as i predicted!

Author: Robin Smith

Date: 22:39:49 01/12/01



On January 12, 2001 at 15:17:00, Dann Corbit wrote:

>The proof of a court is similar to what I am seeking.  Proof beyond a reasonable
>doubt.  Right now, there is a reasonable doubt.

I agree with you.  But other people have a different idea of what constitutes
reasonable doubt, and that seems reasonable to me.  After all, no one is going
to jail over this.  Perhaps you would be more comfortable if they said "it has
been demonstrated" instead of "it has been proven".  I think many people use
the two more or less interchangeably, and that is one reasonable way to use the
words.  Mathematics does not have a monopoly on the word proof.

>>Also, if you want 100% mathematical certainty, a mathematical/statistically
>>based argument can NEVER "prove" that someone is of "GM strength", even if they
>>win 100% of the games they play against GM's and they play hundreds of games.
>
>We will never achieve 100% certainty, because it is impossible.  But we have far
>less than 50% certainty.  Does that sound like something that is proven to you?
>Less than a coin toss?

If you are only talking about the Rebel match, OK.  But there have been other
programs holding their own with GM's as well.  I would guess there is less than
50% uncertainty, but I haven't done the math.  Have you done the math?
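For anyone who does want to do the math, here is a minimal sketch of one way to start.  It asks how likely a weaker player is to reach a given result by luck alone.  Everything in it is an assumption made for illustration: draws are ignored, games are treated as independent, the per-game win probability is taken from the standard Elo expected-score formula, and the 200-point deficit and 6-game match length are hypothetical numbers, not data from the Rebel match.

```python
from math import comb


def elo_expected_score(rating_diff):
    """Standard Elo expected score for a player rated `rating_diff` points
    above (positive) or below (negative) the opposition."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400.0))


def prob_at_least(wins, games, p_win):
    """Binomial tail: chance of at least `wins` wins in `games` independent
    games, each won with probability `p_win` (draws ignored for simplicity)."""
    return sum(
        comb(games, k) * p_win ** k * (1.0 - p_win) ** (games - k)
        for k in range(wins, games + 1)
    )


# Hypothetical case: a player 200 points weaker than the opposition.
p = elo_expected_score(-200)

# Chance that such a player still wins at least half of a 6-game match.
print(prob_at_least(3, 6, p))

# Even a perfect score over 100 games has a tiny but non-zero probability.
print(prob_at_least(100, 100, p))
```

The second print is the "lucky" case: the number gets very small as the games pile up, but it is never exactly zero.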

>>The uncertainty in the claim that they are GM strength becomes very, very small
>>.... but it never goes to zero.  Mathematically there will always be a small,
>>but non-zero, chance that they were very, very "lucky".
>
>Agreed.  I'm not as unreasonable as you might imagine.  (Well, most of the time
>I am not).  I am not looking for 100% certainty.  I am looking for reasonable
>certainty.  I also reject the helter-skelter conditions of the experiments.
>Without adequate controls and repeatability, experiments are worthless.

Without adequate controls and repeatability, experiments are less worthwhile
than they would be with them, but that does not make them "worthless".  If it
did, we would not be able to say Kasparov is a GM, because his games were not
played under adequate controls either.  That would be silly.

>>Right now we can already say, statistically, Rebel IS GM strength.  It is just
>>the uncertainty in the validity of that statement that is still quite large.  As
>>more games are played and the uncertainty goes down, we will have more
>>confidence that the assertion is true (or not, as subsequent data suggests).
>>There is no magic number of games at which point we have suddenly "proved
>>mathematically" that computers are (or are not) GM strength.
>
>Right now, the uncertainty is so large that it is clearly not proven.  It is
>probably true.  It simply has not been proven to be so.

At what point is it "proven"?  One sigma?  Two?  Three?  And how do you factor
in the lack of controls you pointed out above?  The boundary between proven and
unproven is not so precisely defined as you seem to imply.
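To make the sigma question concrete, here is a rough sketch of how the number of sigmas depends on the number of games.  The assumptions are mine: the baseline is the score a player exactly at the threshold would be expected to make against the same opposition, each game is treated as a win/loss trial at that baseline rate (draws only shrink the variance, so this overstates the standard error), and the 60% score and game counts are hypothetical.

```python
import math


def sigmas_above(score_fraction, games, baseline=0.5):
    """Number of standard errors by which an observed score fraction
    exceeds the baseline expected score over `games` games."""
    # Binomial standard error of the mean score at the baseline rate.
    # With draws the true standard error is smaller, so this is conservative.
    std_err = math.sqrt(baseline * (1.0 - baseline) / games)
    return (score_fraction - baseline) / std_err


# The same hypothetical 60% score, over more and more games.
for n in (6, 24, 100, 400):
    print(n, round(sigmas_above(0.6, n), 2))
```

The sigma count grows smoothly with the square root of the number of games.  Nothing special happens at any particular game count, so where "proven" begins is a choice of threshold, not something the math supplies.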

>I don't think it is even proven beyond a reasonable doubt.  In fact, I don't
>think it is even proven to the level of a civil case (where the preponderance of
>the evidence says it is so).  But that one is iffy, I will admit.  I have
>pointed out many flaws in the model, and suggested reasonable assertions that
>demonstrate how the experiments are flawed and how the ratings could change
>dramatically.

I agree there are many flaws.  No one has done a controlled experiment.  But I
don't think we should wait for a controlled experiment, because if we wait for
that we may have to conclude that we don't even know whether programs are over
2000 strength yet.

>All that having been said, they are very likely GM's.  But it will be proven
>when it has been proven.  Right now it isn't.

"Proven when it has been proven", that is an interesting statement.  It makes it
all sound so definite, precise and conclusive.  But if it is so precise, what is
the definition of proven?  The whole thing is actually quite silly, because it
takes something that is inherently probabilistic and tries to make it black or
white .... proven or unproven.  Mathematically this isn't so simple as you seem
to imply.

Robin




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.