Author: Dann Corbit
Date: 16:16:09 07/15/00
On July 15, 2000 at 17:51:55, blass uri wrote:

>On July 15, 2000 at 17:32:05, Dann Corbit wrote:
>
>>On July 15, 2000 at 17:20:18, blass uri wrote:
>>
>>>On July 15, 2000 at 16:59:32, Mogens Larsen wrote:
>>>
>>>>On July 15, 2000 at 16:45:19, ShaktiFire wrote:
>>>>
>>>>>Chris Carson has documented dozens of games at standard time control
>>>>>of computer play vs. GMs.
>>>>>
>>>>>I won't nitpick...this or that program, this or that hardware.
>>>>>
>>>>>But in the last 2 years, dozens of games have been played. Computers
>>>>>vs. GMs at standard time control.
>>>>>
>>>>>Ratings can be calculated with these games. The more games played,
>>>>>the less uncertainty in the rating. The rating indicated, based
>>>>>on these dozens of games, is over 2500.
>>>>
>>>>You can't include games from all types of programs on all types of hardware
>>>>under different game conditions (tournament, exhibition or something else) and
>>>>reach a sound conclusion. Given the number of programs and hardware
>>>>configurations, you can't say that computer programs as a single entity are of
>>>>GM strength. You need an identical setup, software and hardware, and then
>>>>conduct enough games to reduce the uncertainty sufficiently to ensure a
>>>>confident rating above 2500. The scientific method is testing using a stable and
>>>>unchanged setup.
>>>
>>>If you have many programs that have a performance of more than 2500 you can be
>>>sure that the best of them has a rating of more than 2500.
>>>
>>>You can do it without an identical setup, software and hardware.
>>>
>>>You will never get an identical setup of software and hardware in the near future
>>>so by your logic you cannot claim that programs are GM level in the near future.
>>>
>>>I disagree.
>>
>>I disagree with your disagreement. Each program has its own strengths and
>>weaknesses. All programs have bugs in them too. To clump them all together is
>>unsound not only mathematically, but for the obvious reason that you don't have
>>enough games from one program to find out how to attack it.
>>
>>Each program must be decided upon its own merits. Or if we say that "Computer
>>programs are GM strength" then TSCP is a GM. Absurd? Of course. And why --
>>because we have a lot of games by this program to know better. But if we make a
>>few changes to TSCP and make a multithreaded version and put it on a 32 CPU
>>alpha it might be a GM. Was the original TSCP on a PII 300 MHz machine now a
>>GM? Clearly not. Lumping them together is an act of desperation. Either that
>>or a lack of clear thinking.
>>
>>A program on a given hardware setup may or may not be a GM. You cannot lump
>>them all together -- it's simply ridiculous.
>
>I can say that at least one of them is a GM.
>
>Imagine that you have 100 different coins and you want to know if they are fair
>(probability 1/2 for each side).
>
>Suppose you throw all of them one time and you get 100 heads (all fall on the
>same side).
>
>I can reject the conjecture that all of them are fair with almost 100% confidence
>but if I take only one of them I do not have enough data to reject the conjecture
>that it is fair.
>
>I know that at least one of them is unfair but I do not know which one.
>
>The same may be true for programs.

Did you know that if I flip a perfectly fair coin 100 times, the probability of 100 tails in a row is exactly the same as that of each of the other 2^100 - 1 possible outcomes? In fact, every one of the 2^100 possible sequences is equally probable. This is because the occurrence of a head does not alter the probability that the next toss will be a head. It's still 50-50. Head-tail repeated 50 times would be just as astonishing.

The bigger your collection of experimental points, the GREATER your chances of finding an outlier. Given enough trials, every absurd observation will be seen. Your conclusion does not follow.
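
A minimal sketch of the arithmetic, assuming independent tosses of a fair coin. The helper names `sequence_probability` and `prob_at_least_one_outlier`, and the 1-in-1000 event used in the loop, are made up for illustration only:

```python
from fractions import Fraction


def sequence_probability(sequence: str) -> Fraction:
    """Probability of one exact head/tail sequence from a fair coin."""
    # Tosses are independent with probability 1/2 each,
    # so only the length of the sequence matters.
    return Fraction(1, 2) ** len(sequence)


def prob_at_least_one_outlier(p_single: Fraction, trials: int) -> Fraction:
    """Chance that an event of probability p_single shows up
    at least once in `trials` independent attempts."""
    return 1 - (1 - p_single) ** trials


all_tails = "T" * 100
alternating = "HT" * 50

# Both sequences are equally likely: 1 / 2**100.
assert sequence_probability(all_tails) == sequence_probability(alternating)

# Count of distinct 100-toss sequences, and of those other than a given one.
total_sequences = 2 ** 100
other_sequences = total_sequences - 1

# A rare event (assumed 1-in-1000 here) becomes more likely to be
# observed at least once as the number of trials grows.
for trials in (10, 1_000, 10_000):
    chance = prob_at_least_one_outlier(Fraction(1, 1000), trials)
    print(trials, float(chance))
```

Exact fractions are used so that the equality between the two sequences is checked without floating-point rounding.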