Computer Chess Club Archives


Subject: Re: Computer vs. Human Strength - The Statistics Updated for Van der Wiel

Author: Walter Koroljow

Date: 16:20:57 01/17/01


On January 17, 2001 at 14:55:14, Hermano Ecuadoriano wrote:

>On January 17, 2001 at 11:33:25, Walter Koroljow wrote:
>
>>In September, I posted an analysis to the effect that the average rating of the
>>PC programs in Chris Carson's database was between 2502 and 2595 with more than
>>95% probability.
>>
>>After the Van der Wiel match, it is time for an update.  The probability is over
>>95% that the average rating of the PC programs is between 2503 and 2594.  The
>>spread has gone down a bit.
>>
>>Van der Wiel's rating is 2493, which puts Rebel's performance rating for the
>>match at 2560, inside the 2503-2594 interval.  Not much change could be
>>expected from this match, and there wasn't any.
>>
>>I included all PCs running processors at or above 200MHz.  This consisted of 30
>>program/PC combinations playing 163 games (+73,=57,-27) for a score of 105-58
>>(64.4%) against opposition with an average rating of 2418.
>>
>>VERY QUICK OVERVIEW OF CALCULATIONS: On 9/5/00 I posted
>>(http://site2936.dellhost.com/forums/1/message.shtml?128346) giving excruciating
>>detail on the calculations.  This is an overview.  The one major constant
>>assumption is that results of individual games are independent.  The calculation
>>then assumes a trial average rating and a spread for the ratings.  A Monte Carlo
>>simulation runs the 163 games one million times and computes the probabilities
>>of the results.  If the probability calculated for what actually happened is
>>less than 5%, the trial mean and the spread can be rejected.  Spreads from 0 to
>>400 were tried, and in all cases, averages below 2503 and above 2594 had to be
>>rejected.  The effect of the spread was minimal.
>
>Thank you.
>My memory of the Monte Carlo method is dim, but according to my understanding
>this looks good.
>I wonder if our resident statistician is studying this?

I also wonder.
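
For anyone whose memory of the Monte Carlo method is equally dim, the procedure in the quoted overview can be sketched roughly as below. This is only an illustration, not the code behind the posted numbers. Several pieces are assumptions made for the sketch: the Elo logistic expectancy formula, a fixed draw probability (`p_draw`, set near the observed 57/163), a normal distribution for the rating spread, a fresh rating draw for every game, and a single opponent rating of 2418 standing in for the real opposition. The function names are hypothetical, and the trial count is kept far below one million so the sketch runs quickly.

```python
import random


def simulate_total_scores(mean_rating, spread, opp_rating, n_games,
                          p_draw, n_trials, seed=0):
    """Simulate n_trials repetitions of n_games and return the total scores.

    Assumptions (not from the original analysis): ratings are drawn per game
    from Normal(mean_rating, spread), the expected score follows the Elo
    logistic curve, and draws occur with probability p_draw where possible.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        total = 0.0
        for _ in range(n_games):
            rating = rng.gauss(mean_rating, spread)
            expected = 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))
            # Shrink the draw probability if needed so win/loss stay >= 0.
            draw = min(p_draw, 2 * expected, 2 * (1 - expected))
            p_win = expected - draw / 2
            u = rng.random()
            if u < p_win:
                total += 1.0      # win
            elif u < p_win + draw:
                total += 0.5      # draw
            # otherwise a loss, which adds nothing
        totals.append(total)
    return totals


def fraction_at_least(totals, score):
    """Fraction of simulated runs whose total score is at least `score`."""
    return sum(t >= score for t in totals) / len(totals)


# Trial mean of 2503 with a spread of 100, against 2418-rated opposition.
totals = simulate_total_scores(2503, 100, 2418, 163, 0.35, 2000)
frac = fraction_at_least(totals, 105)
```

A trial mean and spread would then be rejected when the simulated probability of a result like the real one is too low. The precise rejection rule is in the 9/5/00 post and is not reproduced here.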

>
>Would you comment on the appropriateness of the usual application of the
>binomial distribution to this material?

Be careful.  Draws make a difference.  The binomial (with p = 0.5) gives the
right answer when win and loss each have probability 0.5 and draws never occur.
But it gives too much spread when win, loss, and draw each have probability
1/3, because a draw always scores exactly half a point.
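
The size of the effect can be checked with a few lines of arithmetic. The sketch below computes the standard deviation of the total score over 163 independent games for the two cases just mentioned, scoring a win as 1, a draw as 0.5, and a loss as 0. The function name `score_sd` is made up for the illustration.

```python
from math import sqrt


def score_sd(n_games, p_win, p_draw):
    """Standard deviation of the total score over independent games.

    Each game scores 1 (win), 0.5 (draw), or 0 (loss).
    """
    p_loss = 1.0 - p_win - p_draw
    if min(p_win, p_draw, p_loss) < -1e-12:
        raise ValueError("probabilities must be non-negative and sum to 1")
    mean = p_win + 0.5 * p_draw                  # expected score per game
    second_moment = p_win * 1.0 + p_draw * 0.25  # E[score^2] per game
    return sqrt(n_games * (second_moment - mean ** 2))


sd_no_draws = score_sd(163, 0.5, 0.0)   # per-game variance 1/4
sd_thirds = score_sd(163, 1 / 3, 1 / 3)  # per-game variance 1/6
```

The per-game variance drops from 1/4 to 1/6 when a third of the games are drawn, so a binomial with p = 0.5 overstates the spread of the total in that case.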

My first impulse was to do this analysis by assuming a mean rating and then
calculating a distribution of results analytically (e.g., a binomial or, taking
draws into account, a trinomial).  One could then reject those means that gave
a low probability (using the distribution) of what really happened.  I gave up
on this approach because I couldn't calculate an exact distribution taking into
account a non-zero spread of computer ratings.

But I could do it for the zero-spread case.  The answers I got agreed with the
simulation answers to 3-4 significant figures.
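
One way such a zero-spread calculation might be set up is sketched below. It builds the exact distribution of the total score by adding one game at a time, counting in half-points. This is not the calculation from the 9/5/00 post. The Elo logistic expectancy, the fixed draw probability of 0.35, and the use of a single 2418-rated opponent for every game are all assumptions of the sketch, and the function names are hypothetical.

```python
def outcome_probs(rating, opp_rating, p_draw):
    """Win/draw/loss probabilities from an Elo expectancy and a draw rate.

    Assumption: expected score = 1 / (1 + 10^((opp - rating) / 400)),
    with p_win = expected - p_draw / 2.
    """
    expected = 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))
    p_win = expected - p_draw / 2
    p_loss = 1.0 - expected - p_draw / 2
    if p_win < 0 or p_loss < 0:
        raise ValueError("p_draw is too large for this rating difference")
    return p_win, p_draw, p_loss


def exact_score_distribution(n_games, p_win, p_draw, p_loss):
    """Return dist where dist[h] = P(total score == h half-points)."""
    dist = [1.0]  # before any game, the score is 0 with certainty
    for _ in range(n_games):
        new = [0.0] * (len(dist) + 2)
        for h, p in enumerate(dist):
            new[h] += p * p_loss      # loss adds 0 half-points
            new[h + 1] += p * p_draw  # draw adds 1 half-point
            new[h + 2] += p * p_win   # win adds 2 half-points
        dist = new
    return dist


def tail_at_least(dist, score):
    """P(total score >= score), with score given in points."""
    return sum(dist[int(round(2 * score)):])


# Trial mean equal to the opposition's average rating, zero spread.
p_win, p_draw, p_loss = outcome_probs(2418, 2418, 0.35)
dist = exact_score_distribution(163, p_win, p_draw, p_loss)
tail = tail_at_least(dist, 105)
```

With zero spread, every game has the same three outcome probabilities, so the sum is a plain trinomial and the half-point convolution is exact up to floating-point rounding. A non-zero spread breaks that, which is where simulation becomes the easier route.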

>
>"Statistics" makes a good partner with the "Humanities", for obvious reasons.
>(I'm sorry that I could not prove that its prominence in Physics is caused by
>bad epistemology.)

The bad epistemology in Physics that annoys me is the attempt to embellish the
purely statistical interpretation of quantum mechanics.  Physicists seem unhappy
with just saying, "Well, the equations mean that if we do the experiment many,
many times, we will get result X 43.4% of the time." They insist on unverifiable
explanations of what goes on in a single experiment.


