Author: Martin Giepmans
Date: 12:45:40 12/31/01
On December 31, 2001 at 10:33:20, Andrew Dados wrote:

>On December 31, 2001 at 10:18:11, Robert Pope wrote:
>
>>On December 31, 2001 at 09:44:52, Andrew Dados wrote:
>>
>>>
>>>Suppose I am getting tons of scores for some experiment which outcome will obey
>>>known distribution (In my problem it is Poisson distribution; type of
>>>distribution should not matter).
>>>
>>>I can't store all scores, but I need to know average and mean parameters, so I
>>>could recreate distribution function at some time later. How can I store some
>>>set of data as small as possible to be able to add new scores to it and still
>>>get my mean/sigma right?
>>>
>>>Example: One experiment is 1000 tosses of a coin. In this case outcome is number
>>>of heads. I will collect unspecified number of such results. In this case I
>>>could simply store an array of 1000 counters, but I can't afford it. Average
>>>number can be easily stored and incrementally updated with 2 ints: total sum and
>>>number of experiments. Can some similar trick be done to recalculate mean value
>>>after new score comes in?
>>>
>>>Chess example (closer to my problem): I have a chess position for which I am
>>>getting time-to-solve results from many players. So their rating distribution is
>>>'predefined' here. The more samples I will collect, the more accurately I can
>>>assign a rating for some new player solving this position. I can not collect all
>>>separate times-to-solve. So for each player I need to update some totals to be
>>>able to calculate mean from those totals (average is easy). Can this be
>>>accurately done?
>>>
>>>..and no... while it sounds like that - it is not some school assignment. :)
>>>
>>>-Andrew-
>>
>>I'm not sure about your terminology here. In statistics, mean _is_ the average,
>>the way most people think about it. Do you intend to say standard deviation?
>>
>>The poisson distribution only has one parameter, the mean (sum(Xi)/N). The
>>standard deviation, sigma, is equal to the mean by definition. It sounds like
>>you already know how to update this statistic. E.g. If you know the number of
>>prior observations included in your current sample mean, N, you can update the
>>sample mean with a new observation like this: newMean =
>>(oldMean+(X[i+1]/N))*N/(N+1). Or you can keep a running total Sum(X[i]) and a
>>running total N.
>
>Thanks.. and indeed I didn't bother looking up poisson distribution formulas.
>I've always confused mean with SD in english...
>
>However question still stands for other distributions. Can standard deviation be
>incrementally re-calculated in similar way? Or do I have to approximate it with
>some 'delta average' tricks, which are way too rough.
>
>-Andrew-

Yes, it can be done.

Remember:
- n
- s1 = X1+X2+...+Xn
- s2 = X1^2+X2^2+...+Xn^2

Now you can calculate the mean
m = s1/n
and the variance
v = (1/n) * sum (Xi-m)^2
  = (sum (Xi^2) + sum (m^2) - sum (2*Xi*m)) / n
  = (s2 + n*m^2 - 2*m*s1) / n
  = s2/n - m^2

Once you have v you can calculate sigma (take the square root).

Martin
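For anyone who wants to try this, here is a minimal sketch of the three-totals idea in Python. The class name RunningStats and its method names are made up for illustration and are not from the thread. It computes the population variance (dividing by n); dividing by n-1 instead would give the sample variance. One assumption worth flagging: with floating-point data whose mean is large compared to its spread, s2/n - m^2 can lose precision, so the result is clamped at zero before taking the square root.

```python
import math


class RunningStats:
    """Hypothetical helper: keeps only n, s1 and s2, never the scores themselves."""

    def __init__(self):
        self.n = 0      # number of scores seen so far
        self.s1 = 0.0   # running sum of the scores
        self.s2 = 0.0   # running sum of the squared scores

    def add(self, x):
        """Fold one new score into the three totals."""
        self.n += 1
        self.s1 += x
        self.s2 += x * x

    def mean(self):
        return self.s1 / self.n

    def variance(self):
        """Population variance: s2/n - m^2."""
        m = self.mean()
        return self.s2 / self.n - m * m

    def sigma(self):
        # Clamp at zero: rounding can push the variance slightly negative.
        return math.sqrt(max(self.variance(), 0.0))


stats = RunningStats()
for score in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(score)

print(stats.mean(), stats.variance(), stats.sigma())  # prints 5.0 4.0 2.0
```

Two such objects can also be merged by adding their n, s1 and s2 fields, which is handy if scores are collected in several places.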
Last modified: Thu, 15 Apr 21 08:11:13 -0700