Computer Chess Club Archives



Subject: Re: I will continue the match until there is a difference of 7 games

Author: Uri Blass

Date: 12:20:04 12/24/00


On December 20, 2000 at 19:06:18, Bruce Moreland wrote:

>On December 20, 2000 at 12:17:19, Uri Blass wrote:
>
>>I think that 25 out of 32 is more significant than 107 out of 200.
>
>I don't think it is a matter of opinion.
>
>You have two programs, A and B.  They play 32 games.  Each game is either won or
>lost.  If one side doesn't score 25 or more, you repeat.  If one side scores 25
>or more, you stop and call that program stronger.
>
>You do the same thing with 200 games and use 107 as your stop score.
>
>My experiments showed that for many different rating differences, the odds of
>making a mistake was about the same.  For instance, if there is a rating point
>difference of 25 Elo points, in the 200 case the weaker side will score at least
>107 out of 200 about 7% of the time that someone does it, which will lead you to
>a wrong conclusion.  In the 32 case, the weaker side will score 25 about 8% of
>the time that someone does it, likewise leading you to a wrong conclusion.  So
>your odds of a wrong conclusion are approximately the same.  I found this to be
>the same for many Elo point differences out to about 80 points of delta, at
>which point it was hard to tell, since in both cases the weaker side almost
>never gives you a false indicator.
>
>If anything, 107/200 seems to be a little more significant than 25/32.


1) I can say that the experiment "I will continue until the difference is 7
games" is a perfect experiment if you want to find out which program is
stronger, because every result with the same difference gives you the same
probability of being wrong.

2) I think that there is something wrong in your calculations.

I find that 25/32 is more significant based on this consideration: 25/32 means a
difference of 18 (25 wins against 7 losses), while 107/200 means a difference of
14 (107 wins against 93 losses).

I will explain why a difference of 18 is more significant than a difference
of 14.


If you assume that the difference between the programs is constant, 25/32 is
always more significant than 107/200 (I assume no draws, to make the problem
simpler).

I use probability of 0.53125 for the stronger side to win.

The probability that the weaker side gets 107/200, based on the binomial
distribution, is (0.46875^107*0.53125^93)*choice(200,107)

choice(200,107) is the number of ways to choose which 107 of the 200 games are
won.

The probability to get 107/200 for the stronger side is
(0.46875^93*0.53125^107)*choice(200,107)

If you divide the first number by the second, choice(200,107) cancels, and so do
93 factors of each probability, so the ratio between the probabilities is
(0.46875^14/0.53125^14)

You can see that this number does not depend on the number of games but only on
the difference.
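The cancellation can be checked numerically. The following is only a minimal sketch under the assumptions of this post (no draws, a fixed win probability of 0.53125 for the stronger side); the function names and the extra 57/100 case are mine, added for illustration.

```python
from math import comb

P_STRONG = 0.53125      # win probability of the stronger side, as assumed in the post
P_WEAK = 1 - P_STRONG   # 0.46875

def prob_score(p_win, wins, games):
    """Binomial probability of exactly `wins` wins in `games` games (no draws)."""
    return comb(games, wins) * p_win**wins * (1 - p_win)**(games - wins)

def likelihood_ratio(wins, games):
    """P(weaker side gets this score) / P(stronger side gets this score)."""
    return prob_score(P_WEAK, wins, games) / prob_score(P_STRONG, wins, games)

def ratio_from_difference(diff):
    """The same ratio written using only the difference wins - losses."""
    return (P_WEAK / P_STRONG) ** diff

# 57/100 is a hypothetical extra case with the same difference (14) as 107/200.
for wins, games in [(107, 200), (57, 100), (25, 32)]:
    diff = wins - (games - wins)
    print(wins, games, diff,
          likelihood_ratio(wins, games), ratio_from_difference(diff))
```

The two printed ratios should agree on each line up to rounding, and the lines for 107/200 and 57/100 should show the same value, while 25/32 (difference 18) gives a smaller one.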

This means that the probability of being wrong when you reach a conclusion
depends only on the difference 107-93 and on the strength difference that you
assume between the 2 programs, when you do not know in advance which program is
stronger.
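To turn the ratio into a probability of being wrong, one possible reading of "you do not know which program is stronger" is that each program is equally likely to be the stronger one before the match. That equal prior is my assumption for this sketch, as is the name `prob_wrong`; with ratio r = 0.46875/0.53125, the chance that the leader by d games is really the weaker program is then r^d/(1+r^d).

```python
def prob_wrong(diff, p_strong=0.53125):
    """Chance that the side leading by `diff` games is actually the weaker one.

    Assumes no draws, a fixed win probability `p_strong` for the stronger side,
    and equal prior odds for which program is the stronger one (an assumption
    of this sketch, not something stated in the post).
    """
    r = (1 - p_strong) / p_strong
    return r**diff / (1 + r**diff)

# Differences discussed in the thread: 7 (the stopping rule), 14 (107/200), 18 (25/32).
for diff in (7, 14, 18):
    print(diff, prob_wrong(diff))
```

Under these assumptions the value falls as the difference grows, so a lead of 18 is less likely to be wrong than a lead of 14, whatever the number of games played.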

Uri



