Computer Chess Club Archives

Subject: Re: About head or tail (was Upon scientific truth - the nature of informati

Author: Graham Laight

Date: 03:32:00 07/16/00

On July 16, 2000 at 03:34:45, Ed Schröder wrote:

>Right.
>
>A few months ago Christophe posted some interesting stuff here regarding
>this topic and nobody really was in agreement with him (me included) until
>I did an experiment which worked as an eye opener for me. The story is not
>funny and goes like this...
>
>In Rebel Century's Personalities you have the option [Strength of Play=100].
>The value may vary from 1 to 100, and 100 is (of course) the default value.
>
>Lowering this value causes Rebel to lower its NPS (nodes per second). This
>makes it possible to create (100% equal!) engines whose only difference is
>that they run SLOWER.
>
>I wanted to know HOW MANY games are needed to show that a 10% faster
>version beats a 10% slower version, and by what margin. So I created
>two personalities:
>
>FAST.ENG (default settings) [Strength of Play=100]
>SLOW.ENG (default settings) [Strength of Play=80]
>
>and played 600 engine-vs-engine games with Rebel's built-in autoplayer,
>using predefined fixed opening lines that both engines had to play with
>both White and Black.
>
>The personality whose only change is [Strength of Play=80] caused Rebel to
>slow down by exactly 10% on the machine where the marathon match took place.
>Note that this value (80) may differ on other PCs in case you want to do
>similar experiments.
>
>Here are the results of the 600 games played between the FAST and SLOW
>personalities. The first 300 games were played on a time control of "5
>seconds average". The second 300 games were played on a time control of
>"10 seconds average".
>
>FAST - SLOW   162.5 - 137.5   [ 0:05 ]
>FAST - SLOW   147.0 - 153.0   [ 0:10 ]
>
>The first match of 300 games at 5 seconds looks convincing. A 54.2% score
>from 10% more speed seems like a value one might expect.
>
>But what about the crazy result of match 2? Apparently even 300 games are
>not enough to prove that the 10% faster version is superior (of course
>it is); the match score suggests both versions are equal, which is not
>true.
>
>So how many games are needed to prove that version X is better than Y?
>
>I am sure I am trying to reinvent the wheel. The casino guys who make
>themselves a good living (with red and black) figured it all out
>centuries ago. Perhaps there is a FAQ somewhere on the Internet that
>explains how many times you have to spin the wheel to get an exact
>50.0% division between red and black. 1000? 2000?
>
>To answer this question I wrote a little program that randomly emulates
>chess matches. It shows that 100 games is nothing: scores like 60-40
>appear on the screen too often. 500 games (and more) seem to do well, as
>most of the time match scores fall within the 49.0 - 51.0 area.
>
>The bad news (in any case for me) is that it hardly makes any sense to
>test candidate program improvements using (even) long matches. Back to
>common sense: 10% = 10% = better. Oh well...
>
>Ed
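
Ed's little program is not shown in the post. A minimal sketch of that kind of simulation might look like the following, assuming every game is independent and that the two engines are exactly equal by default. The names simulate_match and share_within, and the draw_rate parameter, are hypothetical and not taken from Rebel or from Ed's program.

```python
import random


def simulate_match(n_games, expected=0.5, draw_rate=0.0, rng=random):
    """Play n_games independent games; return the first engine's score fraction.

    expected  : assumed true per-game score of the first engine (0.5 = equal)
    draw_rate : assumed share of drawn games (0.0 = pure coin flip)
    """
    p_win = expected - draw_rate / 2.0
    if p_win < 0.0 or p_win + draw_rate > 1.0:
        raise ValueError("draw_rate is too large for this expected score")
    score = 0.0
    for _ in range(n_games):
        r = rng.random()
        if r < p_win:
            score += 1.0          # win
        elif r < p_win + draw_rate:
            score += 0.5          # draw
        # otherwise a loss: nothing is added
    return score / n_games


def share_within(n_games, lo=0.49, hi=0.51, trials=2000, seed=1, **match_kwargs):
    """Fraction of simulated matches whose score lands in [lo, hi]."""
    rng = random.Random(seed)     # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        if lo <= simulate_match(n_games, rng=rng, **match_kwargs) <= hi:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(n, "games:", share_within(n))
```

Running it for several match lengths shows how often an even match really ends inside the 49.0 - 51.0 band. Passing a non-zero draw_rate narrows the spread, since draws add less variance than decisive games.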

I remember from doing a statistics ancillary in my computing degree that there
is a distribution (not "normal" distribution) for calculating binary event
result probabilities - but I can't remember what it is called.
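
The distribution meant here is presumably the binomial distribution, which gives the probability of k successes in n independent yes/no trials. A small sketch, assuming games are independent and ignoring draws (a simplification, since real chess matches have many draws), of how it could be used to ask how likely a genuinely better engine is to score 50% or less. The function names are made up for illustration.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Probability of exactly k wins in n games with win probability p (0 < p < 1).

    Computed in log space so that large n does not overflow.
    """
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1.0 - p))


def prob_at_most(k, n, p):
    """Probability of k or fewer wins in n games."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


if __name__ == "__main__":
    # Chance that an engine with an assumed true score of 54% wins
    # no more than half of a 300-game match (draws ignored).
    print(prob_at_most(150, 300, 0.54))
```

The 54% figure is only borrowed from the first match result above as an assumed true strength; the real edge of the faster version is unknown.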

However, I have myself written programs to simulate these outcomes, and my
observation is that if you do 10x as many simulations, your accuracy level
increases by one decimal place.
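
This rule of thumb can be compared against the textbook standard error of a proportion, sqrt(p(1-p)/n), assuming independent games and no draws. Under that formula the spread shrinks with the square root of the number of games, so 10x as many games narrows it by a factor of about 3.16, and a full extra decimal place takes about 100x. A short sketch, with hypothetical function names:

```python
from math import ceil, sqrt


def score_std_error(n_games, p=0.5):
    """Standard error of the match score fraction after n_games (no draws)."""
    return sqrt(p * (1.0 - p) / n_games)


def games_needed(edge, z=1.96, p=0.5):
    """Smallest number of games for which z standard errors fit inside `edge`.

    edge : score margin to resolve, as a fraction (0.04 means 54% vs 50%)
    z    : width in standard errors (1.96 is the usual two-sided 95% value)
    """
    return ceil((z * sqrt(p * (1.0 - p)) / edge) ** 2)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, "games: +/-", round(100 * score_std_error(n), 2), "% (one std error)")
    print("games to resolve a 4% edge:", games_needed(0.04))
```

Because draws reduce the per-game variance, the numbers from this sketch are on the pessimistic side for real chess matches.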

-g
