Computer Chess Club Archives


Subject: Lies, data, and statistics (or something like that)

Author: Johanes Suhardjo

Date: 17:54:43 03/10/02


It has been a long time since I had time to read this forum, but I "found"
something that I'd like to share since it might be useful to some of you.
In short, it is a Perl script that analyzes the results of a match; the
explanation is in the header of the script.  If this is a horse that has
been beaten to death many times since I blinked, please excuse my
ignorance.

I probably won't be able to read this forum for another millennium, so if
you have any comments or corrections, please email me at johanes@nd.edu.

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ CUT HERE =+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=
#!/usr/bin/perl

# Running my chess program, parrot, against a set of fixed opponents using
# a fixed set of starting positions, I got frustrated with the fluctuating
# results.  So, I turned to statistics.
#
# Chess should follow a binomial distribution (theoretically, each game has
# the same probability of a success (win)).  According to a statistics book,
# a binomial distribution can be approximated by a normal distribution if
#         n p >= 5
# and
#         n (1-p) >= 5
# where n is the number of games
# and   p is the probability of a win
# Basically, it means that we can use the normal distribution if neither
# side scores fewer than 5 points.
#
# I wrote this script, stat, to analyze the results of parrot matches.
# I store the result of each game in a file, one result per line:
# 0.5
# 0
# 0
# 0.5
# 0
# 0
# 1
# and so on.  If the file is called temp, I run "./stat temp"
#
# Johanes Suhardjo, March 8, 2002

$count = 0;
$sum = 0;
$sum2 = 0;

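# mean
while (<>)
{
    chomp;
    next if /^\s*$/;            # skip blank lines
    $x[$count] = $_;            # index with $count, not the bareword count
    $sum = $sum + $x[$count];
    $count++;
}
die "No games found.\n" if $count == 0;
$average = $sum/$count;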
print "From $count games:\n";
print "\tAverage = ", $average, "\n";

# Check if normal distribution is a good approximation
$us = $count * $average;
$them = $count - $us;
if ($us < 5.0 || $them < 5.0)
  {die "Not enough data to use normal distribution.\n"};

# standard deviation (sample): accumulate the squared deviations
for ($i = 0; $i < $count; $i++)
{
    $sum2 += ($x[$i] - $average) * ($x[$i] - $average);
}
$stdeviation = sqrt ($sum2 / ($count-1));
print "\tStandard deviation = ", $stdeviation, "\n";

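# Approximate standard normal curve area from 0 to z, where z is the distance
# of the average from 0.5 measured in standard errors of the mean
if ($stdeviation == 0) {die "All results are identical; cannot compute z.\n"};
$z = abs (($average - 0.5) / ($stdeviation / sqrt ($count)));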
# if $z = 1.645, the area is 0.45, corresponds to 95% confidence level
# if $z = 1.2817, the area is 0.40, corresponds to 90% confidence level
# if $z = 0.84, the area is 0.30, corresponds to 80% confidence level
# Linear interpolation between table entries 0.1 apart; each slope is the
# table difference divided by 0.1 (change in area per unit of z)
if ($z >= 1.645) {$area = 0.45;}
elsif ($z >= 1.6) {$area = 0.4452 + ($z-1.6) * 0.102;}
elsif ($z >= 1.5) {$area = 0.4332 + ($z-1.5) * 0.120;}
elsif ($z >= 1.4) {$area = 0.4192 + ($z-1.4) * 0.140;}
elsif ($z >= 1.3) {$area = 0.4032 + ($z-1.3) * 0.160;}
elsif ($z >= 1.2) {$area = 0.3849 + ($z-1.2) * 0.183;}
elsif ($z >= 1.1) {$area = 0.3643 + ($z-1.1) * 0.206;}
elsif ($z >= 1.0) {$area = 0.3413 + ($z-1.0) * 0.230;}
elsif ($z >= 0.9) {$area = 0.3159 + ($z-0.9) * 0.254;}
elsif ($z >= 0.8) {$area = 0.2881 + ($z-0.8) * 0.278;}
else {$area = 0;}

# The verdict
print "\tYour program ";
if ($area >= 0.45) {print "is definitely ";}
elsif ($area >= 0.4) {print "is very likely ";}
elsif ($area >= 0.3) {print "seems to be ";}
else {print "may or may not be ";}
if ($average-0.5 > 0.0) {print "stronger than its opponents.\n";}
else {print "weaker than its opponents.\n";}
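
For anyone who wants to check the arithmetic by hand, here is a minimal
sketch of the same mean / standard deviation / z calculation wrapped in a
function. The name match_stats and the 20-game sample match are made up for
illustration and are not part of the script above. It assumes z is measured
against the standard error of the mean (standard deviation divided by the
square root of the number of games), which is the usual form of this test.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns the mean score,
# the sample standard deviation, and the z-score of the mean against the
# break-even score of 0.5.
sub match_stats {
    my @scores = @_;
    my $n = @scores;
    die "Need at least two games.\n" if $n < 2;

    my $sum = 0;
    $sum += $_ for @scores;
    my $mean = $sum / $n;

    # Sum of squared deviations from the mean
    my $ss = 0;
    $ss += ($_ - $mean) ** 2 for @scores;
    my $sd = sqrt($ss / ($n - 1));
    die "All results are identical; z is undefined.\n" if $sd == 0;

    # Assumption: z uses the standard error of the mean, sd / sqrt(n)
    my $z = ($mean - 0.5) / ($sd / sqrt($n));
    return ($mean, $sd, $z);
}

# Made-up match: 12 wins, 4 draws, 4 losses in 20 games
my @games = ((1) x 12, (0.5) x 4, (0) x 4);
my ($mean, $sd, $z) = match_stats(@games);
printf "mean = %.3f, sd = %.3f, z = %.3f\n", $mean, $sd, $z;
```

With this sample both sides score at least 5 points (14 and 6), so the
normal approximation condition from the script header is met.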





