Computer Chess Club Archives


Subject: Re: Test match with the Botvinnik-Markoff extension

Author: Sune Fischer

Date: 13:20:21 10/03/03


On October 03, 2003 at 15:22:37, Dieter Buerssner wrote:

>On October 03, 2003 at 10:05:06, Tord Romstad wrote:
>
>>After 200 games, the score was 111.5-89.5.
>
>Which does add up to 201.
>
>>Could you do a similar
>>calculation with these numbers (or better yet, teach me how to do it
>>myself)?
>
>I was too lazy (or unable :-) to figure it out analytically. However, a Monte
>Carlo simulation of a 200 game match between equal opponents, where the games
>are considered independent (no learning, wide book to avoid repeated games,
>...), and assuming white wins 40% of the games, black wins 30% of the games, and
>30% are drawn, shows:

What you are doing is finding the probability of the result under an already
given probability distribution.

I think there is a better question to be asked.

What you want to do is really the reverse: to calculate the probabilities _from_
the data.
The question is, with what level of confidence can we claim that the program
scoring higher is also better?

We know that the confidence level grows as the number of games increases, so it
has to be part of the formula somehow.

-S.
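
A rough sketch of how the number of games enters, under assumptions that are not
in the posts: still the equal-opponents hypothesis, the 40/30/30 split from the
quoted simulation, alternating colors, and a normal approximation. Here S is one
player's score after N games and \sigma^2 is the per-game score variance; both
symbols are introduced only for this illustration. The variance comes out the
same with either color (0.3 + 0.3*0.25 - 0.45^2 is also 0.1725).

```latex
\begin{aligned}
\sigma^2 &= 0.4\cdot 1^2 + 0.3\cdot 0.5^2 - 0.55^2 = 0.1725 \\
z &= \frac{S - N/2}{\sigma\sqrt{N}}
   = \frac{111.5 - 100}{\sqrt{200\cdot 0.1725}} \approx 1.96
\end{aligned}
```

A z of about 1.96 corresponds to a two-sided tail of roughly 5%, close to the
4.6% in the quoted simulation. For a fixed score percentage p, the numerator is
N(p - 1/2), so z grows like sqrt(N). Note that this is still a test against the
equal-opponents hypothesis, not the reverse calculation asked for above.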

>C:\e\dcrand>cmatch 200 40 30 30 1000000
>Result of chess matches between equal opponents
>White wins 40.0%, black 30.0% and 30.0% draws
>A match of 200 games was simulated by 1000000 Monte Carlo tries
>
>               result       probability         <= this          > this
>100.0 - 100.0 ( 50.0%):          3.398%          3.398%         96.602%
>100.5 - 99.5  ( 50.3%):          6.800%         10.198%         89.802%
>101.0 - 99.0  ( 50.5%):          6.649%         16.847%         83.153%
>101.5 - 98.5  ( 50.8%):          6.565%         23.412%         76.588%
>102.0 - 98.0  ( 51.0%):          6.420%         29.832%         70.168%
>102.5 - 97.5  ( 51.3%):          6.157%         35.988%         64.012%
>103.0 - 97.0  ( 51.5%):          5.954%         41.942%         58.058%
>103.5 - 96.5  ( 51.8%):          5.719%         47.661%         52.339%
>104.0 - 96.0  ( 52.0%):          5.404%         53.065%         46.935%
>104.5 - 95.5  ( 52.3%):          5.063%         58.128%         41.872%
>105.0 - 95.0  ( 52.5%):          4.734%         62.862%         37.138%
>105.5 - 94.5  ( 52.8%):          4.417%         67.280%         32.720%
>106.0 - 94.0  ( 53.0%):          4.053%         71.333%         28.667%
>106.5 - 93.5  ( 53.3%):          3.663%         74.996%         25.004%
>107.0 - 93.0  ( 53.5%):          3.313%         78.309%         21.691%
>107.5 - 92.5  ( 53.8%):          2.985%         81.294%         18.706%
>108.0 - 92.0  ( 54.0%):          2.708%         84.002%         15.998%
>108.5 - 91.5  ( 54.3%):          2.412%         86.415%         13.585%
>109.0 - 91.0  ( 54.5%):          2.103%         88.518%         11.482%
>109.5 - 90.5  ( 54.8%):          1.823%         90.341%          9.659%
>110.0 - 90.0  ( 55.0%):          1.587%         91.927%          8.073%
>110.5 - 89.5  ( 55.3%):          1.367%         93.295%          6.705%
>111.0 - 89.0  ( 55.5%):          1.171%         94.465%          5.535%
>111.5 - 88.5  ( 55.8%):          0.981%         95.447%          4.553%
>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>112.0 - 88.0  ( 56.0%):          0.837%         96.284%          3.716%
>112.5 - 87.5  ( 56.3%):          0.711%         96.995%          3.005%
>113.0 - 87.0  ( 56.5%):          0.584%         97.579%          2.421%
>113.5 - 86.5  ( 56.8%):          0.485%         98.064%          1.936%
>114.0 - 86.0  ( 57.0%):          0.393%         98.457%          1.543%
>114.5 - 85.5  ( 57.3%):          0.329%         98.786%          1.214%
>115.0 - 85.0  ( 57.5%):          0.269%         99.055%          0.945%
>115.5 - 84.5  ( 57.8%):          0.212%         99.267%          0.733%
>116.0 - 84.0  ( 58.0%):          0.169%         99.436%          0.564%
>116.5 - 83.5  ( 58.3%):          0.130%         99.566%          0.434%
>117.0 - 83.0  ( 58.5%):          0.108%         99.674%          0.326%
>117.5 - 82.5  ( 58.8%):          0.083%         99.757%          0.243%
>[...]
>
>The marked line means: in 95.4% of such matches between equal opponents, the
>result will be no more extreme than 111.5-88.5. In 4.6% of such matches, one
>can expect a more extreme result. In other words, it is quite likely that
>the winner is better.
>
>The Monte Carlo simulation can be done easily. Just roll a die with numbers
>1-100. If the result is between 1 and 40, white wins. If the result is between
>41 and 70, black wins. If the result is between 71 and 100, it is a draw (for
>the numbers above). Roll the die 200 times, and add up the results (don't forget
>to switch colors). Repeat this many times, and tally how often each result occurs.
>
>Writing a good routine to roll a die is trickier than it may seem at first
>sight. Obviously one will use a pseudo random number generator (PRNG). But the
>straightforward methods (all the ones I have read about in books or for example
>in the comp.lang.c FAQ) still have a bias when the maximum return value of the
>PRNG, plus one, is not divisible by the number of faces of the die.
>
>I suggest something like
>
>/* returns 0 <= r < range equally distributed */
>unsigned long rand_range(unsigned long range)
>{
>  unsigned long rmax, r, d;
>  /* find the largest number rmax <= MY_RAND_MAX, for which
>     (rmax+1) % range == 0.
>     All returns from rand() > rmax will be skipped, to guarantee
>     equal probability for all return values. */
>  d = (MY_RAND_MAX+1-range) / range + 1; /* Ignore possible compiler warning */
>  rmax = d * range - 1; /* -1 to avoid "overflow to zero" */
>  do
>    r = MY_PRNG();
>  while (r > rmax);
>  return r/d;
>}
>
>Remi has written a nice article where he looks at it analytically; you will
>find it easily if you follow the link Peter has suggested.
>
>Regards,
>Dieter
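
The quoted die-rolling procedure could be sketched in C roughly as follows. This
is not the cmatch program from the post, whose source is not shown. The
generator (a 64-bit LCG with Knuth's MMIX constants, upper 32 bits returned) is
a stand-in chosen only to make the example self-contained, and the names
seed_prng, next_u32, play_match, tail_fraction and CMATCH_DEMO are hypothetical.
rand_range follows the same rejection idea as the quoted routine, using 64-bit
arithmetic so that PRNG_MAX + 1 does not overflow.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical stand-in for the post's MY_PRNG: a 64-bit LCG whose upper
   32 bits are returned, so the largest return value is 2^32 - 1. */
#define PRNG_MAX 0xFFFFFFFFu
static uint64_t prng_state = 1;

void seed_prng(uint64_t seed) { prng_state = seed; }

uint32_t next_u32(void)
{
    prng_state = prng_state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (uint32_t)(prng_state >> 32);
}

/* Returns 0 <= r < range, all values equally likely: raw values above rmax
   are rejected so that every face owns exactly d raw values. */
uint32_t rand_range(uint32_t range)
{
    uint64_t d = ((uint64_t)PRNG_MAX + 1) / range; /* raw values per face */
    uint64_t rmax = d * range - 1;                 /* largest accepted value */
    uint64_t r;
    do
        r = next_u32();
    while (r > rmax);
    return (uint32_t)(r / d);
}

/* Plays one match between equal opponents and returns player A's score in
   half-points. A has white in even-numbered games, black in odd ones. */
int play_match(int games, int white_pct, int black_pct)
{
    int half_points = 0;
    for (int g = 0; g < games; g++) {
        uint32_t roll = rand_range(100);            /* 0..99 */
        int a_is_white = (g % 2 == 0);
        if (roll < (uint32_t)white_pct)             /* white wins */
            half_points += a_is_white ? 2 : 0;
        else if (roll < (uint32_t)(white_pct + black_pct)) /* black wins */
            half_points += a_is_white ? 0 : 2;
        else                                        /* draw */
            half_points += 1;
    }
    return half_points;
}

/* Fraction of simulated matches in which the winner ends more than `margin`
   half-points above an even score (40% white wins, 30% black, 30% draws). */
double tail_fraction(int games, long tries, int margin)
{
    long more_extreme = 0;
    for (long t = 0; t < tries; t++) {
        int diff = play_match(games, 40, 30) - games;
        if (diff < 0)
            diff = -diff;
        if (diff > margin)
            more_extreme++;
    }
    return (double)more_extreme / (double)tries;
}

#ifdef CMATCH_DEMO
int main(void)
{
    seed_prng(2003);
    /* 111.5 out of 200 is 223 half-points, i.e. 23 above an even score. */
    printf("more extreme than 111.5-88.5: %.3f%%\n",
           100.0 * tail_fraction(200, 1000000L, 23));
    return 0;
}
#endif
```

Compiled with -DCMATCH_DEMO, the driver prints the share of simulated matches
more extreme than 111.5-88.5, which should land in the neighbourhood of the
"> this" column for the marked row of the quoted table. The exact digits depend
on the generator and seed.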



Last modified: Thu, 15 Apr 21 08:11:13 -0700