Author: Bas Hamstra
Date: 08:07:31 06/11/01
I think this is how it works: if both programs are equally strong, you can expect each of them to score 50% in a match. However, we have noise and luck. Therefore in a 100-game match (still supposing both are equally strong) the outcome will not be exactly 50%, but will lie within a "window". Now you can say something about that window. For instance, it will stretch over [mean - 3 * Sigma, mean + 3 * Sigma] about 99.7% of the time. The same goes for 2 * Sigma, which holds about 95% of the time. Sigma is calculated here as SQR(n * p * q), where SQR is the square root, n is the sample size, p is A's winning chance (0.50) and q is B's winning chance (also 0.50). It is measured in games, around a mean of n * p.

So let's take a 100-game match. Sigma would be SQR(100 * 0.50 * 0.50) = 5, so 2 * Sigma = 10. Therefore the noise window is [40, 60] with 95% confidence ([45, 55] is only the 1 * Sigma window, which holds about 68% of the time). If the outcome falls OUTSIDE this window, it is fair to say the proggies are not equally strong; in other words, one is better.

Bas Hamstra.

On June 11, 2001 at 05:08:00, Gian-Carlo Pascutto wrote:

>Hi all,
>
>In light of some of the results posted here, I'd like to
>make some people aware of the existence of the 'Whoisbetter'
>utility.
>
>It is a program by Steve Maughan (http://home.clara.net/maughan/)
>that can aid you in determining whether a certain result has
>any statistical meaning.
>
>It's a bit unfortunate there is little explanation of the
>mathematics used :( It only says it is based on the binomial distribution.
>If anyone has an idea on how it could work, please post, as
>I will begin working on a similar program but based on other
>mathematics soon.
>
>A little table for those who can't run Windows programs:
>The winning program must at least have the 'required' score
>to be able to say it is better with standard statistical
>significance.
>
>Played games   Required score
>-----------------------------------
>      5           5 - 0
>     10           8 - 2
>     15          11 - 4
>     20          14 - 6
>     30          20 - 10
>     50          31 - 19
>    100          59 - 41
>    200         113 - 89
>
>So for example, if you play a match between 2 programs and
>one scores 7-3, you _can't_ say that the winner is stronger.
>
>Well, you can, but there are people that are selling rings that
>make you live forever, which are proven to work based on the
>same kind of highly reliable science.
>
>--
>GCP
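For anyone who wants to play with the numbers, here is a minimal Python sketch of the Sigma arithmetic from the reply above. The function names (`sigma`, `noise_window`, `required_score`) are made up for illustration, and draws are ignored. The `required_score` helper is only a guess at one way such a table could be produced: it assumes a one-sided 5% test with the normal approximation, and it is not known to be the method Whoisbetter uses.

```python
import math
from statistics import NormalDist  # Python 3.8+


def sigma(n, p=0.5):
    """Standard deviation of the number of wins in n games (draws ignored)."""
    return math.sqrt(n * p * (1.0 - p))


def noise_window(n, k=2.0, p=0.5):
    """Scores lying within k * Sigma of the expected score n * p."""
    s = sigma(n, p)
    return n * p - k * s, n * p + k * s


def required_score(n, alpha=0.05):
    """Smallest whole score above the one-sided (1 - alpha) bound.

    Assumption: equal strength (p = 0.5), one-sided test, normal
    approximation. This is an illustrative guess, not Whoisbetter's method.
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    return math.ceil(n / 2 + z * sigma(n))


print(sigma(100))              # Sigma for a 100-game match
print(noise_window(100, k=2))  # 2 * Sigma window, about 95%
print(noise_window(100, k=1))  # 1 * Sigma window, about 68%
for n in (10, 50, 100):
    print(n, required_score(n))
```

For 100 games this gives Sigma = 5, a 2 * Sigma window of [40, 60] and a 1 * Sigma window of [45, 55]. Under the stated assumptions `required_score` returns 8, 31 and 59 for 10, 50 and 100 games, which agrees with those rows of the quoted table, but that agreement alone does not show how the table was actually computed.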
Last modified: Thu, 15 Apr 21 08:11:13 -0700