Author: James T. Walker
Date: 04:33:46 01/21/03
On January 20, 2003 at 21:15:53, Dann Corbit wrote:

>On January 20, 2003 at 18:41:02, Sune Fischer wrote:
>
>>On January 20, 2003 at 18:34:21, Dann Corbit wrote:
>>
>>>On January 20, 2003 at 18:08:45, Sune Fischer wrote:
>>>
>>>>On January 20, 2003 at 17:27:44, Dann Corbit wrote:
>>>>
>>>>>>>No contest can truly tell us which program is strongest. Not even a trillion
>>>>>>>rounds of round-robin.
>>>>>>I disagree again. I believe a trillion rounds will show which program is
>>>>>>strongest.
>>>>>
>>>>>You're wrong.
>>>>>
>>>>
>>>>No he is right.
>>>>There is a saying in statistics (IIRC) "null events don't happen".
>>>>
>>>>Basically it means things that are very very improbable are impossible.
>>>>
>>>>You would never see TSCP beat Fritz more than 50% of the time if you did a
>>>>trillion games. No one has done more than a trillion games yet, we all know
>>>>Fritz is stronger, why is that? ;)
>>>
>>>Until the number of games reaches infinity, there will always be uncertainty.
>>>
>>>Because there is some degree of randomness in the programs, I'm not even sure
>>>that there *is* an answer to the question:
>>>"Which is stronger, Chess Tiger or Fritz?"
>>>
>>>For programs with hundreds of ELO difference, you can be fairly certain
>>>relatively quickly. For programs of about the same strength, you will never
>>>know the answer.
>>
>>But what you were saying was, that you could _never ever_ know the answer. There
>>is a fundamental difference I think and this is where the null event theorem
>>saves us. It _is_ possible to make an accurate statement if you have reduced it
>>to a null event. After 1 trillion games I think we have a clear winner, whoever
>>that may be.
>
>I would be utterly astonished if it were true.
>
>After a trillion coin flips, we will still have random walk problems, and it
>could (on rare occasions) be considerable. How will we discern the random walk
>drift from a very tiny change in strength? At the top, the strength of the
>programs appears to be very close. This is the exact region where random walk
>will give us the most trouble.
>
>In other words, I think we will not be able to discern (on a blind test) whether
>we pit top program A against top program B or whether it was A against A or B
>against B.

Playing Fritz 8 vs. Chess Tiger 15, or something similar, is not equivalent to a
coin toss. You are purposely distorting the issue with false analogies to try to
prove a not-so-valid point. For instance, a coin toss would be more like playing
Fritz 8 vs. Fritz 8.
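For anyone who wants to put rough numbers on the "how many games" question, here is a minimal sketch in Python. It is only an illustration and not something anyone in the thread actually ran. The assumptions are mine: games are treated as independent, each game's variance is taken at the worst case of 0.25 (no draws, evenly matched opponents), the confidence interval uses a normal approximation, and the score is converted to rating points with the usual logistic Elo formula. The function names are made up for this example.

```python
import math


def score_margin(n_games, z=1.96):
    """Half-width of a normal-approximation confidence interval on the
    score fraction after n_games.

    Assumption: independent games with worst-case per-game variance 0.25.
    """
    return z * math.sqrt(0.25 / n_games)


def elo_from_score(p):
    """Elo difference implied by an expected score p (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def resolvable_elo(n_games, z=1.96):
    """Elo gap whose expected score exceeds 50% by exactly the margin.

    Gaps smaller than this are hard to tell apart from noise at the
    chosen confidence level; larger gaps should show up in the score.
    """
    return elo_from_score(0.5 + score_margin(n_games, z))


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000, 10**12):
        print(f"{n:>16,d} games: about {resolvable_elo(n):.4f} Elo")
```

Under these assumptions the resolvable gap shrinks with the square root of the number of games: it never reaches exactly zero for a finite match, but at a trillion games it is well under a hundredth of an Elo point. Real engine matches have draws and opening-book effects, so treat the figures as order-of-magnitude only.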
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.