Computer Chess Club Archives



Subject: a few statistical demonstrations to help my troubled mind

Author: Joseph Ciarrochi

Date: 14:34:54 01/25/06


I have started a huge 40/4 tournament and rybka has gotten off to a bad start
against fritz9, with f9 leading 9 to 5 (though rybka is leading the tournament).

Now I am a statistician and know that small numbers don't mean that much, but I
can't help thinking "what the heck is wrong with rybka?" It is so difficult for
me to let go of this tendency to see patterns.

So, to break myself of this bad habit, I ran a huge statistical simulation. I
set it up so the true winning percentage of rybka against fritz9 is about 57%.
In other words, I created a population of wins, losses and draws with a 57%
score for rybka (a win counts 1, a draw 0.5), with the ratio of wins, draws, and
losses being the same as found in the CEGT 40/40 tournament.


Then, I randomly drew ten thousand samples of 20 games from this population,
just to see how often "weird values" would show up.

This sampling procedure is similar to what we do every day, i.e., I play 20
games of rybka versus fritz9, with a specific type of hardware and particular
openings. This is just a sample of possible values, with a certain error rate.
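
The sampling procedure above can be sketched in a few lines of Python. This is
only an illustration, not the code behind the original simulation: the draw rate
(P_DRAW = 0.30) is an assumed placeholder, since the CEGT win/draw/loss ratio is
not given in this post, and the function names are made up for the example.
Because of that assumption, the percentages it prints need not match a run based
on the real CEGT ratio.

```python
import random

# Assumed for illustration only: the actual CEGT draw ratio is not given here.
P_DRAW = 0.30
# Choose the win probability so the expected score per game is 57%:
# expected score = P_WIN + 0.5 * P_DRAW = 0.57
P_WIN = 0.57 - 0.5 * P_DRAW
P_LOSS = 1.0 - P_WIN - P_DRAW


def simulate_match_scores(games=20, samples=10_000, seed=1):
    """Return rybka's total score in each of `samples` simulated matches."""
    rng = random.Random(seed)
    points = [1.0, 0.5, 0.0]           # win, draw, loss
    weights = [P_WIN, P_DRAW, P_LOSS]
    return [sum(rng.choices(points, weights=weights, k=games))
            for _ in range(samples)]


def share_at_or_below(scores, threshold):
    """Fraction of simulated matches with a score of `threshold` or less."""
    return sum(score <= threshold for score in scores) / len(scores)


scores = simulate_match_scores()
for threshold in (7.0, 8.0, 9.0, 10.0, 11.0):
    share = share_at_or_below(scores, threshold)
    print(f"{threshold:4.1f} or less: {share:.1%}")
```

Changing `games` from 20 to a larger number shows how quickly the "weird"
scores become rare as the match gets longer.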

Ok, here are the results of the simulation.

Given a true population score of 57%, or about 11.5 out of 20:

Rybka gets this score or less     Percentage out of 10,000 samples

 7   - 13                          1%
 7.5 - 12.5                        2.2%
 8   - 12                          4.2%
 8.5 - 11.5                        7.1%
 9   - 11                         11%
 9.5 - 10.5                       17%
10   - 10                         24%
10.5 -  9.5                       34%
11   -  9                         45%

(skipping a few increments)

12.5 -  7.5                       75% (about 25% of the time you'll see better scores)
13   -  7                         83% (17% of the time you'll see better)
14.5 -  5.5                       97% (3% of the time you'll see better)




The reliability of the estimated win percentage will increase with increasing
sample size/number of games (as everybody knows) and with increasing control of
factors that cause error (e.g., making the engines play both sides of the same
opening, making all conditions as even as possible, etc.).
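
To put a rough number on the sample-size point, here is a small sketch of the
standard error of the score percentage for different match lengths. It assumes
independent games and the same made-up split as a 57% score with 30% draws
(42% wins, 30% draws, 28% losses); the real CEGT ratio is not given in this
post, and the function name is just for the example.

```python
import math


def score_standard_error(p_win, p_draw, games):
    """Standard error of the score fraction over `games` independent games."""
    mean = p_win + 0.5 * p_draw
    second_moment = p_win + 0.25 * p_draw   # 1^2 * P(win) + 0.5^2 * P(draw)
    variance = second_moment - mean ** 2
    return math.sqrt(variance / games)


# Assumed split: 42% wins, 30% draws, 28% losses (expected score 57%).
for games in (20, 100, 1000):
    se = score_standard_error(0.42, 0.30, games)
    print(f"{games:5d} games: 57% +/- {se:.1%} (one standard error)")
```

The error shrinks with the square root of the number of games, so it takes four
times as many games to halve it.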

Here is another thought. If you run a gauntlet, let's say rybka versus five
engines, then you've roughly multiplied your chances of getting an odd result by
5. So if we define odd as "something that only occurs 5% of the time", then you
have an approximately 25% chance of observing a result that is odd. If you look
at 100 matches, then about 5 will probably be pretty odd.
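
The gauntlet arithmetic can be checked with a short sketch. It assumes the
matches are independent of each other, which is a simplification, and the
function name is invented for the example.

```python
def chance_of_any_odd_result(matches, odd_rate=0.05):
    """Probability that at least one of `matches` independent matches is 'odd'."""
    return 1.0 - (1.0 - odd_rate) ** matches


# Five-engine gauntlet, with "odd" meaning a result seen only 5% of the time.
print(f"5 matches: {chance_of_any_odd_result(5):.1%} chance of at least one odd result")

# Expected number of odd results among 100 matches.
expected_odd = 100 * 0.05
print(f"100 matches: about {expected_odd:.0f} odd results expected")
```

Under independence the figure for five matches comes out a little below the
5 x 5% = 25% rule of thumb, because the rule of thumb counts the cases with
two or more odd results more than once.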

Ah, well, that helps me get over rybka's slow start in my tournament. Now to
work :)

best
Joseph







