Author: Bas Hamstra
Date: 05:49:39 07/16/00
On July 16, 2000 at 08:37:20, Bas Hamstra wrote:

>On July 16, 2000 at 03:34:45, Ed Schröder wrote:
>
>>>posted by Dann Corbit on July 15, 2000 at 20:21:54:
>>
>>>Simplifying. I have a penny.
>>>I toss it twice.
>>>Heads, heads.
>>>I toss it twice.
>>>Heads, heads.
>>>I toss it twice.
>>>Tails, heads.
>>>I toss it twice.
>>>Heads, tails.
>>
>>>I count them up.
>>
>>>Heads are stronger than tails.
>>
>>>My conclusion is faulty. Why? Because I did not gather enough data.
>>
>>Right.
>>
>>A few months ago Christophe posted some interesting stuff here regarding
>>this topic and nobody really was in agreement with him (me included) until
>>I did an experiment which worked as an eye opener for me. The story is not
>>funny and goes like this...
>>
>>In Rebel Century's Personalities you have the option [Strength of Play=100]
>>The value may vary from 1 to 100 and 100 is (of course) the default value.
>>
>>Lowering this value will cause Rebel to lower its NPS. This opens the
>>possibility to create (100% equal!) engines whose only difference is
>>that they run SLOWER.
>>
>>I was interested to know HOW MANY games were needed to show that a 10%
>>faster version could beat a 10% slower version, and with what numbers. So
>>I created two personalities:
>>
>>FAST.ENG (default settings) [Strength of Play=100]
>>SLOW.ENG (default settings) [Strength of Play=80]
>>
>>and started to play 600 eng-eng games with Rebel's built-in autoplayer,
>>with predefined fixed opening lines that both engines had to play with
>>white and black.
>>
>>The personality whose only change was [Strength of Play=80] caused Rebel
>>to slow down by exactly 10% on the machine where the marathon match took
>>place. Note that this value (80) may differ on other PCs in case you want
>>to do similar experiments.
>>
>>Here are the results of the 600 games played between the FAST and SLOW
>>personalities. The first 300 games were played on a time control of "5
>>seconds average". The second 300 games were played on a time control of
>>"10 seconds average".
>>
>>FAST - SLOW   162.5 - 137.5   [ 0:05 ]
>>FAST - SLOW   147.0 - 153.0   [ 0:10 ]
>>
>>The first match of 300 games at 5 secs looks convincing. A 54.1% score
>>because of the 10% more speed seems a value one might expect.
>>
>>But what about the crazy result of match 2? Apparently 300 games are
>>still not enough to prove that the 10% faster version is superior (of
>>course it is), but the match score indicates both versions are equal,
>>which is not true.
>>
>>So how many games are needed to prove that version X is better than Y?
>>
>>I am sure I am trying to reinvent the wheel. The casino guys who make
>>themselves a good living (with red and black) have figured it all out
>>centuries ago. Perhaps there is a FAQ somewhere on the Internet that
>>explains how many times you have to spin the wheel to get an exact
>>50.0% division between red and black. 1000? 2000?
>
>You can *never* be sure. But you can say something like this:
>
>99.7% of the time the outcome will lie between 50-3*sigma and 50+3*sigma.
>
>Sigma can be calculated as SQR(n*p*q), where n is the number of games, and
>p=0.50 and q=0.50 (q = 1-p).
>
>Example: if you run 100 games (or toss that many coins, or spin the wheel),
>the result will be 99.7% of the time between
>
> 50-SQR(100 * 0.50 * 0.50)
>
>and
>
> 50+SQR(100 * 0.50 * 0.50)
>
>that is between 45% and 55%.

Ahem: 99.7% confidence corresponds to 3 * sigma, and sigma = SQR(100 * 0.50 * 0.50) = 5, so the result is between 35 and 65. With 95% confidence (2 * sigma) it is between 40 and 60.
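
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: it assumes every game is an independent win/loss trial with p = 0.50 (draws are ignored, which makes real chess results look noisier than they are), and the function name `score_interval` and its parameters are made up for this example.

```python
import math


def score_interval(n_games, p=0.5, z=3.0):
    """Return (low, high) bounds on the expected score after n_games.

    Assumption (not from the thread): each game is an independent
    win/loss trial with win probability p; draws are ignored.
    z is the number of standard deviations (2 ~ 95%, 3 ~ 99.7%).
    """
    mean = n_games * p
    sigma = math.sqrt(n_games * p * (1.0 - p))  # SQR(n*p*q) with q = 1-p
    return mean - z * sigma, mean + z * sigma


# 100 games between equal opponents: sigma = 5
for z, label in ((2.0, "about 95%"), (3.0, "about 99.7%")):
    low, high = score_interval(100, z=z)
    print(f"100 games, {label}: between {low:.1f} and {high:.1f}")

# 300 games, as in the FAST vs SLOW matches
low, high = score_interval(300, z=2.0)
print(f"300 games, about 95%: between {low:.1f} and {high:.1f}")
```

Under these assumptions the 2 * sigma band for 300 games between equal engines runs from roughly 132.7 to 167.3 points, so both of Ed's match scores (162.5 and 147.0) fall inside what pure chance could produce.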
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.