Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: [OT] - probability algorithm question

Author: Robert Pope
Date: 08:25:10 12/31/01
On December 31, 2001 at 10:33:20, Andrew Dados wrote:

>On December 31, 2001 at 10:18:11, Robert Pope wrote:
>
>>On December 31, 2001 at 09:44:52, Andrew Dados wrote:
>>
>>>
>>>Suppose I am getting tons of scores for some experiment which outcome will obey
>>>known distribution (In my problem it is Poisson distribution; type of
>>>distribution should not matter).
>>>
>>>I can't store all scores, but I need to know average and mean parameters, so I
>>>could recreate distribution function at some time later. How can I store some
>>>set of data as small as possible to be able to add new scores to it and still
>>>get my mean/sigma right?
>>>
>>>Example: One experiment is 1000 tosses of a coin. In this case outcome is number
>>>of heads. I will collect unspecified number of such results. In this case I
>>>could simply store an array of 1000 counters, but I can't afford it. Average
>>>number can be easily stored and incrementally updated with 2 ints: total sum and
>>>number of experiments. Can some similar trick be done to recalculate mean value
>>>after new score comes in?
>>>
>>>Chess example (closer to my problem): I have a chess position for which I am
>>>getting time-to-solve results from many players. So their rating distribution is
>>>'predefined' here. The more samples I will collect, the more accurately I can
>>>assing a rating for some new player solving this position. I can not collect all
>>>separate times-to-solve. So for each player I need to update some totals to be
>>>able to calculate mean from those totals (average is easy). Can this be
>>>accurately done?
>>>
>>>..and no... while it sounds like that - it is not some school assignment. :)
>>>
>>>-Andrew-
>>
>>I'm not sure about your terminology here.  In statistics, mean _is_ the average,
>>the way most people think about it.  Do you intend to say standard deviation?
>>
>>The poisson distribution only has one parameter, the mean (sum(Xi)/N).  The
>>standard deviation, sigma, is equal to the mean by definition.  It sounds like
>>you already know how to update this statistic.  E.g. If you know the number of
>>prior observations included in your current sample mean, N, you can update the
>>sample mean with a new observation like this:  newMean =
>>(oldMean+(X[i+1]/N))*N/(N+1).  Or you can keep a running total Sum(X[i]) and a
>>running total N.
>
>Thanks.. and indeed I didn't bother looking up poisson distribution formulas.
>I've always confused mean with SD in english...
>
>However question still stands for other distributions. Can standard deviation be
>incrementally re-calculated in similar way? Or do I have to approximate it with
>some 'delta average' tricks, which are way too rough.
>
>-Andrew-
If memory serves,
stdDev^2 = Variance = E[X]^2 - E[X^2] = sum(X[i]/n)^2 - sum(X[i]*X[i]/n).
So, you should be able to just add a third counter, sum(X[i]^2), and then
recalculate variance or standard deviation each iteration similarly to the mean.

Most of the common distributions are one- or two-parameter probability
functions.  But there are some with more, and if you truly want to be able to
use _any_ distribution, you would also have to look into kurtosis and other
E[X^k] values.  But I don't think you want to open that can of worms and it's
been to long for me to be much help there.
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.