Computer Chess Club Archives



Subject: Re: Upon scientific truth - the nature of information

Author: Robert Hyatt

Date: 14:56:09 07/15/00



On July 15, 2000 at 17:20:18, blass uri wrote:

>On July 15, 2000 at 16:59:32, Mogens Larsen wrote:
>
>>On July 15, 2000 at 16:45:19, ShaktiFire wrote:
>>
>>>Chris Carson has documented dozens of games at standard time control
>>>of computer play vs. GMs.
>>>
>>>I won't nitpick...this or that program, this or that hardware.
>>>
>>>But in the last 2 years, dozens of games have been played.  Computers
>>>vs. GMs at standard time control.
>>>
>>>Ratings can be calculated with these games.  The more games played,
>>>the less uncertainty in the rating.  The rating indicated, based
>>>on these dozens of games is over 2500.
>>
>>You can't include games from all types of programs on all types of hardware
>>under different game conditions (tournament, exhibition or something else) and
>>reach a sound conclusion. Given the number of programs and hardware
>>configurations, you can't say that computer programs as a single entity are of
>>GM strength. You need an identical setup, software and hardware, and then
>>conduct enough games to reduce the uncertainty sufficiently to ensure a
>>confident rating above 2500. The scientific method is testing using a stable and
>>unchanged setup.
>
>If you have many programs that have a performance of more than 2500, you can be
>sure that the best of them has a rating of more than 2500.

That is simply unsound logic.  I do ten trials of flipping a coin 10 times.  In
7 of the trials I get more heads than tails.  Does this mean that _one_ of those
seven trials is certainly the _truth_?

What if person A picks the results from ten such trials, and in nearly every
trial he picked, more heads than tails came up?  What if others have run other
trials but didn't publicize their results, and their trials came up mostly
tails?
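
To make the coin example concrete, here is a minimal sketch of the selection
effect. It assumes a fair coin and a seeded random generator; the function
names (run_trials, cherry_pick) are made up for illustration and are not from
any chess or statistics package.

```python
import random

FLIPS = 10  # flips per trial, as in the example above


def run_trials(n_trials, flips=FLIPS, seed=0):
    """Return the number of heads in each of n_trials fair-coin trials."""
    rng = random.Random(seed)  # seeded only so the run is repeatable
    return [sum(rng.random() < 0.5 for _ in range(flips))
            for _ in range(n_trials)]


def cherry_pick(results, flips=FLIPS):
    """Keep only the trials where heads outnumbered tails."""
    return [heads for heads in results if heads > flips - heads]


results = run_trials(1000)
picked = cherry_pick(results)

# Fraction of heads over all trials versus over the hand-picked ones.
overall = sum(results) / (FLIPS * len(results))
reported = sum(picked) / (FLIPS * len(picked))

print(f"all trials:    {overall:.3f} heads")
print(f"picked trials: {reported:.3f} heads")
```

The coin is fair by construction, yet the picked trials always show at least
60% heads, because every trial that survives the filter has 6 or more heads
out of 10. The filter creates the result, not the coin.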

Small samples from a very large population (different programs, different time
controls, different hardware, etc.) do _not_ make good statistical data...
Just because several had TPRs > 2500 does not mean that the best
of the group is clearly over 2500.  The only thing you can prove from that
data is that the area of a circle is equal to pi times the radius squared.
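
The same point can be sketched for ratings. The numbers below are assumptions
chosen for illustration only (30 hypothetical programs, each with a true
rating of 2400, each playing 10 games against 2500-rated opposition, no
draws); none of them come from the games discussed in this thread. The
expected score is the standard Elo logistic formula, and the performance
rating is simply its inverse.

```python
import math
import random

TRUE_RATING = 2400  # assumed true strength of every program
OPP_RATING = 2500   # assumed rating of every opponent


def expected_score(rating, opp_rating):
    """Standard Elo expected score for `rating` against `opp_rating`."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))


def performance_rating(score_fraction, opp_rating, n_games):
    """Invert the Elo formula; clamp 0% and 100% scores (an assumption)."""
    s = min(max(score_fraction, 0.5 / n_games), 1.0 - 0.5 / n_games)
    return opp_rating + 400.0 * math.log10(s / (1.0 - s))


def simulate_tprs(n_programs=30, n_games=10, seed=1):
    """Observed TPR of each program; wins and losses only, no draws."""
    rng = random.Random(seed)
    p_win = expected_score(TRUE_RATING, OPP_RATING)
    tprs = []
    for _ in range(n_programs):
        wins = sum(rng.random() < p_win for _ in range(n_games))
        tprs.append(performance_rating(wins / n_games, OPP_RATING, n_games))
    return tprs


tprs = simulate_tprs()
print(f"true rating of every program: {TRUE_RATING}")
print(f"best observed TPR:            {max(tprs):.0f}")
print(f"programs with TPR > 2500:     {sum(t > 2500 for t in tprs)}")
```

Every program here has the same true rating, so any TPR above 2400 is pure
sampling noise from a 10-game match. Taking the best of the group picks out
the luckiest run, not the strongest program.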




>
>You can do it without identical setup, software and hardware.
>
>You will never get identical setup of software and hardware in the near future
>so by your logic you cannot claim that programs are GM level in the near future.
>
>I disagree.
>
>Uri




Last modified: Thu, 15 Apr 21 08:11:13 -0700
