Computer Chess Club Archives


Subject: Re: I'm wrong about 10-0 vs 60-40

Author: Walter Koroljow

Date: 09:13:31 02/06/01


On February 05, 2001 at 22:39:18, Walter Koroljow wrote:

>Thank you for the detailed reply.  I didn't know what the real problem was.  But
>maybe I can help reconcile the two cases you talked about and take a small chip
>out of the general case. As you did, let's forget about draws.  Let's compare
>the probability of getting 60-40, say, with the stronger program to the
>probability of getting it with the weaker program.  In general, a very good
>simple approximation to the answer is:
>
>Ratio = exp(8*eps*(x-N/2))
>
>Where:
>
>x = program score,
>N = number of games played,
>eps = We - 0.5 where We = win expectancy.
>
>This approximation is excellent as long as eps is not too big and as long as N
>is on the order of 10 or more.  Let us try a few cases:
>
>N = 100, x = 60, We = 0.55, eps = 0.05.
>
>Then Ratio = exp (8*0.05*(60-50)) = exp (4) = 54.6.
>
>With the exact approach instead, taking the ratio of the two binomial terms for
>60-40 and canceling the common factors in numerator and denominator, we get
>
>Ratio = (.55/.45)^20 = 55.33.  A 1.3% error.
>
>But the approximation becomes extremely accurate as We approaches 0.5.  Try:
>
>N=100, x = 60, We = 0.51, eps = .01.
>
>Ratio = exp(0.8) = 2.2255.  Compare this to (.51/.49)^20 = 2.2258.
>
>This approximation means that two test results are equivalent as long as they
>have the same value of eps*(x-N/2), since then the chance of confusing the two
>programs is the same for the two results.  So here are some equivalent test
>results with a probability of confusing the two programs at 8%:
>
>N     x     We
>10    10   .56
>20    15   .56
>20    20   .53
>60    40   .53
>60    60   .51
>100   60   .53
>
>It is hard to measure small differences!
>
>For the record, the approximation is derived by approximating the binomial
>distribution of scores with a normal distribution and then using it to calculate
>the ratio of the two probabilities for getting the observed result with the two
>programs.  I can supply more detail if anyone is interested.
>
>Good luck,
>
>Walter

Looking back, I notice that I wrote the formula in terms of x-N/2.  It is easier
to understand if we substitute x-N/2 = (W-L)/2, where W = wins and L = losses
(with no draws, x = W and N = W+L).  Then the formula becomes:

Ratio = exp(4*eps*(W-L))  where:

W = wins
L = losses
eps = We - 0.5, where We = win expectancy
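Ratio = probability of getting the result if the program is the stronger one
(win expectancy We), divided by the probability of getting it if the program is
the weaker one (win expectancy 1 - We).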
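
As a quick check of the formula, here is a minimal Python sketch comparing it with the exact ratio. It assumes no draws, in which case the binomial coefficients cancel and the exact ratio reduces to (We/(1-We))^(W-L). The function names are made up for illustration and are not from the original post.

```python
import math

def approx_ratio(wins, losses, we):
    """Approximation from the post: exp(4*eps*(W-L)), with eps = We - 0.5."""
    eps = we - 0.5
    return math.exp(4.0 * eps * (wins - losses))

def exact_ratio(wins, losses, we):
    """Exact ratio of the two binomial probabilities, assuming no draws.

    The binomial coefficients are identical in numerator and denominator,
    so only (We / (1 - We)) ** (W - L) survives.
    """
    return (we / (1.0 - we)) ** (wins - losses)

# The 60-40 result for two assumed win expectancies.
for we in (0.51, 0.55):
    a = approx_ratio(60, 40, we)
    e = exact_ratio(60, 40, we)
    print(f"We={we:.2f}  approx={a:.4f}  exact={e:.4f}")
```

For 60-40 this should give roughly 54.6 versus 55.3 at We = 0.55, and values agreeing to about three decimal places at We = 0.51.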

Uri Blass first pointed out the importance of W-L in another post.

Now we can have an easier-to-understand table of equivalent test results, each
of which gives a probability of confusion of 8%:

Wins - Losses    Win expectancy
-------------    --------------
   10                 .56
   20                 .53
   40                 .515
   60                 .51
  120                 .505
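
The post does not spell out how the 8% figure is computed. The sketch below assumes the probability of confusion is 1/(1+Ratio), that is, both hypotheses (the program is the stronger one or the weaker one) are treated as equally likely beforehand. Under that assumption a 10-0 result at We = 0.56 comes out near 8%. Both function names are hypothetical.

```python
import math

def confusion_probability(w_minus_l, we):
    """Assumed definition: 1 / (1 + Ratio), with Ratio = exp(4*eps*(W-L))."""
    eps = we - 0.5
    ratio = math.exp(4.0 * eps * w_minus_l)
    return 1.0 / (1.0 + ratio)

def margin_needed(we, target=0.08):
    """W-L at which the assumed confusion probability equals `target`.

    Solves 1 / (1 + exp(4*eps*(W-L))) = target for W-L.
    """
    eps = we - 0.5
    return math.log(1.0 / target - 1.0) / (4.0 * eps)

# A 10-0 sweep (W-L = 10) with an assumed win expectancy of 0.56.
print(f"{confusion_probability(10, 0.56):.3f}")

# Margin needed for the same confidence as the edge shrinks.
for we in (0.56, 0.53, 0.51):
    print(f"We={we:.2f}  W-L needed = {margin_needed(we):.1f}")
```

Under the approximation the required margin scales as 1/eps, so halving the edge doubles the win-loss difference needed for the same confidence.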

In all cases the number of games played does not matter, only the difference
W-L, a point again first made by Uri. How is that for counterintuitive?
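
The independence from the number of games can be checked directly from the binomial probabilities, without any approximation. The sketch below assumes no draws and a win expectancy of 0.56 chosen only for illustration. It computes the full ratio for three results that share W-L = 10 but differ in length. The helper names are hypothetical.

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of exactly k wins in n games with win probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

def likelihood_ratio(wins, losses, we):
    """P(result | expectancy We) / P(result | expectancy 1 - We), no draws."""
    n = wins + losses
    return binom_pmf(n, wins, we) / binom_pmf(n, wins, 1.0 - we)

# Three results with the same W-L = 10 but different numbers of games.
for wins, losses in ((10, 0), (15, 5), (55, 45)):
    print(wins, losses, likelihood_ratio(wins, losses, 0.56))
```

All three lines should print the same ratio up to floating-point rounding, because the binomial coefficient and the common powers of We and 1-We cancel.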

Of course, these equivalences hold only while the formula is accurate, and it
becomes inaccurate for large We.  However, in tests, the formula is pretty
accurate for We = 0.6, and even sometimes for We = 0.65.  For example, for the
10-0 case with We = 0.6, the formula gives 54.6 versus the correct 57.7 (about
5% low), and for We = 0.65, the formula gives 403 versus the correct 488 (about
17% low).
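
To see how quickly the formula degrades, this small sketch tabulates its relative shortfall against the exact no-draw ratio for a 10-0 result. The function name is hypothetical and the list of We values is only an illustration. One way to understand the shortfall is that the exact exponent per unit of W-L is ln(We/(1-We)) = 2*atanh(2*eps), whose series starts 4*eps + (16/3)*eps^3, and the formula keeps only the first term.

```python
import math

def relative_shortfall(w_minus_l, we):
    """(exact - approx) / exact for the ratio, assuming no draws."""
    eps = we - 0.5
    approx = math.exp(4.0 * eps * w_minus_l)
    exact = (we / (1.0 - we)) ** w_minus_l
    return (exact - approx) / exact

# The 10-0 case (W-L = 10) for increasingly large win expectancies.
for we in (0.55, 0.60, 0.65):
    print(f"We={we:.2f}  shortfall={relative_shortfall(10, we):.1%}")
```

Because the neglected term grows with both eps^3 and W-L, the approximation holds up longer for small margins than for large ones at the same We.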

Cheers,

Walter



Last modified: Thu, 15 Apr 21 08:11:13 -0700
