Author: Chris Welty
Date: 00:53:39 10/08/04
Previous message was accidentally sent while I was typing it, and it's wrong, so please ignore it...

>I did some more research on this--
>We are assuming here that an engine's win percentage follows a binomial
>distribution with probability p, meaning that, out of "n" games played (with no
>draws), we can expect the engine to win n*p games on average.

Right.

>3) the interval is then given approximately by:
>
>phat +/- z(a/2) sqrt ( phat * (1-phat) / n )

This is usually a pretty good approximation. It's even better if you use it in reverse and calculate the confidence interval for phat from the known null-hypothesis value; in this case the null hypothesis is that the engines are of equal strength (i.e. win percentage 0.5), and so the bounds of the confidence interval are

phat = 0.5 +/- z(a/2) sqrt(0.5*0.5/n)

or

abs((2*phat-1) sqrt(n)) = z(a/2)

In the language of my original post (where phat=W/(W+L), n=R, z(a/2)=T),

T=abs(S/sqrt(R))

which is the formula from my original post.

In answer to your other points:

>1. You drop draw scores-- you should be either rolling them into wins or losses
>or using a multinomial distribution model.

Draws are certainly relevant if I'm trying to decide HOW MUCH better one engine is than another. They're not relevant to the question "does A beat B more than B beats A", which is why I restricted my whole note to this case.

>2. You did not state which distribution you believed the number of wins -
>number of losses to follow.

If the sample is random, independent, and identically distributed (iid), then the distribution has to be binomial.

>2a. Your sample is not random, although if this continues to be a picking
>point, I will drop it as the rest of my claims are valid and can't be refuted.

There's no statistical test to prove a sample is iid; there are only tests to prove it's not. In my testing I've not found any nonrandomness, but since the whole method hinges on this, I'd be very interested in any evidence that it's not random.

>3. Your transformation of variables and your test on the statistic "t" is not
>valid-- you might be assuming a normal model, but I can't know that since you
>didn't say so. Nevertheless, even if you did, I can show you that your
>transformations do not lead to a distribution that can be safely approximated
>by the normal distribution.

The statistics in your followup post approximated the binomial distribution with a normal one, and the formula you gave is virtually identical to my original one. I'll assume I've convinced you on this; please let me know if I haven't. The normal distribution is quite an accurate approximation to the binomial when phat is close to 0.5.
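For anyone who wants to check the numbers, here is a minimal Python sketch of the test described above. It assumes S = W - L (the "number of wins - number of losses" mentioned in point 2) and R = W + L with draws dropped; the function names are made up for illustration and are not from the original post. It computes T, the two-sided tail probability under the normal approximation, and an exact binomial tail under the null hypothesis p = 0.5 for comparison.

```python
import math


def t_statistic(wins, losses):
    """T = |S| / sqrt(R), assuming S = wins - losses and R = wins + losses."""
    s = wins - losses
    r = wins + losses
    return abs(s) / math.sqrt(r)


def normal_two_sided_p(t):
    """Two-sided tail probability of a standard normal beyond +/- t."""
    return math.erfc(t / math.sqrt(2))


def exact_two_sided_p(wins, losses):
    """Exact two-sided binomial tail under the null hypothesis p = 0.5.

    Doubles the upper tail of the larger of (wins, losses), capped at 1.
    """
    r = wins + losses
    k = max(wins, losses)
    tail = sum(math.comb(r, i) for i in range(k, r + 1)) / 2 ** r
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    w, l = 60, 40  # hypothetical match result, draws already dropped
    t = t_statistic(w, l)
    print("T =", t)
    print("normal approx p =", normal_two_sided_p(t))
    print("exact binomial p =", exact_two_sided_p(w, l))
```

With W=60 and L=40, S=20 and R=100, so T = 20/sqrt(100) = 2. Running the script prints both tail probabilities, so you can see how far apart the normal approximation and the exact binomial are for a given match length.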
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.