Computer Chess Club Archives


Subject: Re: what type of result is significant in 100 game match

Author: John Sidles

Date: 20:08:08 02/18/06


On February 18, 2006 at 18:50:28, Thomas Mayer wrote:

>Hi John,
>
>On February 18, 2006 at 17:28:15, John Sidles wrote:
>
>> nGame = 100, confidence = 95.%, score = 58.5/100
>> nGame = 100, confidence = 99.%, score = 61./100
>> nGame = 100, confidence = 99.9%, score = 64./100
>
>interesting. This looks way too low to me... You mean a 58.5 - 41.5 means that
>the winning engine is by 95% confidence stronger then the loser... I would have
>expected at least something like 80-20... How did you calculate that ?
>
>Greets, Thomas

What this means is just what you say, namely, that if we assume that two
programs are in fact of equal strength, with probability 1/3 of winning,
losing, and drawing in each game, then in a 100 game match we can be 95%
confident that the match score will be in the range 41.5--58.5.

So if the match score is *outside* this range, we are justified in suspecting
that the programs are *not* of equal strength ... because there is only a 5%
probability of seeing such a large mismatch between equal-strength programs.

The way the statisticians say this is awkward, but mathematically correct: we
reject the null hypothesis (that the programs are of equal strength) with 95%
confidence.

By the way, the score thresholds would be slightly different if the null
hypothesis probabilities (win,lose,draw) were (1/4,1/4,1/2) instead of
(1/3,1/3,1/3). Here, for completeness, I post several cases.

One take-home message is that a high probability of draws makes it harder to
win a match by luck.  Maybe this is why high-level players will accept a draw
readily, rather than take any chance of losing.
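
The effect of draws can be seen from the spread of the match score alone.
Here is a rough sketch, not the calculation behind the tables: it assumes
independent games scored win = 1, draw = 0.5, loss = 0, and the function name
match_sd is made up for illustration.

```python
import math

def match_sd(n_games, p_win, p_loss, p_draw):
    """Standard deviation of the total match score (win=1, draw=0.5, loss=0).

    Assumes independent games with the same probabilities in every game.
    p_loss is unused in the arithmetic (a loss scores 0) and is kept only
    so the arguments follow the (win,lose,draw) order used in the post.
    """
    mean = p_win + 0.5 * p_draw            # expected score per game
    second_moment = p_win + 0.25 * p_draw  # E[score^2] per game
    per_game_variance = second_moment - mean ** 2
    return math.sqrt(n_games * per_game_variance)

for label, probs in [("many draws", (0.25, 0.25, 0.5)),
                     ("moderate draws", (1 / 3, 1 / 3, 1 / 3)),
                     ("no draws", (0.5, 0.5, 0.0))]:
    sd = match_sd(100, *probs)
    print(f"{label}: sd = {sd:.2f} points, 1.96*sd = {1.96 * sd:.1f}")
```

For equal-strength programs the per-game variance works out to (1 - p_draw)/4,
so more draws means a narrower spread of match scores around 50%, and a lucky
lopsided result becomes less likely.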

How do I calculate these?  Well, I just calculate every possible match score,
assign it the appropriate probability, and add it up!  The computer doesn't
mind.

-----------------------------------------------------------------------
Null hypothesis is that (win,lose,draw) = (1/4,1/4,1/2),
i.e., many draws

nGame = 3, confidence = 95.%, score = 3./3

nGame = 4, confidence = 95.%, score = 4./4
nGame = 4, confidence = 99.%, score = 4./4

nGame = 5, confidence = 95.%, score = 4.5/5
nGame = 5, confidence = 99.%, score = 5./5

nGame = 6, confidence = 95.%, score = 5./6
nGame = 6, confidence = 99.%, score = 5.5/6
nGame = 6, confidence = 99.9%, score = 6./6

nGame = 7, confidence = 95.%, score = 6./7
nGame = 7, confidence = 99.%, score = 6.5/7
nGame = 7, confidence = 99.9%, score = 7./7

nGame = 8, confidence = 95.%, score = 6.5/8
nGame = 8, confidence = 99.%, score = 7./8
nGame = 8, confidence = 99.9%, score = 7.5/8

nGame = 9, confidence = 95.%, score = 7./9
nGame = 9, confidence = 99.%, score = 7.5/9
nGame = 9, confidence = 99.9%, score = 8.5/9

nGame = 10, confidence = 95.%, score = 7.5/10
nGame = 10, confidence = 99.%, score = 8.5/10
nGame = 10, confidence = 99.9%, score = 9./10

nGame = 15, confidence = 95.%, score = 10.5/15
nGame = 15, confidence = 99.%, score = 11.5/15
nGame = 15, confidence = 99.9%, score = 12.5/15
nGame = 15, confidence = 99.9999%, score = 14./15

nGame = 20, confidence = 95.%, score = 13.5/20
nGame = 20, confidence = 99.%, score = 14.5/20
nGame = 20, confidence = 99.9%, score = 15.5/20
nGame = 20, confidence = 99.9999%, score = 18./20

nGame = 30, confidence = 95.%, score = 19.5/30
nGame = 30, confidence = 99.%, score = 20.5/30
nGame = 30, confidence = 99.9%, score = 22./30
nGame = 30, confidence = 99.9999%, score = 24.5/30

nGame = 40, confidence = 95.%, score = 25./40
nGame = 40, confidence = 99.%, score = 26./40
nGame = 40, confidence = 99.9%, score = 28./40
nGame = 40, confidence = 99.9999%, score = 31./40

nGame = 50, confidence = 95.%, score = 30.5/50
nGame = 50, confidence = 99.%, score = 32./50
nGame = 50, confidence = 99.9%, score = 33.5/50
nGame = 50, confidence = 99.9999%, score = 37.5/50

nGame = 75, confidence = 95.%, score = 44./75
nGame = 75, confidence = 99.%, score = 46./75
nGame = 75, confidence = 99.9%, score = 48./75
nGame = 75, confidence = 99.9999%, score = 53./75

nGame = 100, confidence = 95.%, score = 57.5/100
nGame = 100, confidence = 99.%, score = 59.5/100
nGame = 100, confidence = 99.9%, score = 62./100
nGame = 100, confidence = 99.9999%, score = 67.5/100


-----------------------------------------------------------------------
Null hypothesis is that (win,lose,draw) = (1/3,1/3,1/3),
i.e., moderate draws

nGame = 4, confidence = 95.%, score = 4./4

nGame = 5, confidence = 95.%, score = 4.5/5
nGame = 5, confidence = 99.%, score = 5./5

nGame = 6, confidence = 95.%, score = 5.5/6
nGame = 6, confidence = 99.%, score = 6./6

nGame = 7, confidence = 95.%, score = 6./7
nGame = 7, confidence = 99.%, score = 6.5/7
nGame = 7, confidence = 99.9%, score = 7./7

nGame = 8, confidence = 95.%, score = 6.5/8
nGame = 8, confidence = 99.%, score = 7.5/8
nGame = 8, confidence = 99.9%, score = 8./8

nGame = 9, confidence = 95.%, score = 7.5/9
nGame = 9, confidence = 99.%, score = 8./9
nGame = 9, confidence = 99.9%, score = 9./9

nGame = 10, confidence = 95.%, score = 8./10
nGame = 10, confidence = 99.%, score = 8.5/10
nGame = 10, confidence = 99.9%, score = 9.5/10

nGame = 15, confidence = 95.%, score = 11./15
nGame = 15, confidence = 99.%, score = 12./15
nGame = 15, confidence = 99.9%, score = 13./15
nGame = 15, confidence = 99.9999%, score = 15./15

nGame = 20, confidence = 95.%, score = 14./20
nGame = 20, confidence = 99.%, score = 15./20
nGame = 20, confidence = 99.9%, score = 16.5/20
nGame = 20, confidence = 99.9999%, score = 19./20

nGame = 30, confidence = 95.%, score = 20./30
nGame = 30, confidence = 99.%, score = 21./30
nGame = 30, confidence = 99.9%, score = 22.5/30
nGame = 30, confidence = 99.9999%, score = 26./30

nGame = 40, confidence = 95.%, score = 25.5/40
nGame = 40, confidence = 99.%, score = 27./40
nGame = 40, confidence = 99.9%, score = 29./40
nGame = 40, confidence = 99.9999%, score = 32.5/40

nGame = 50, confidence = 95.%, score = 31./50
nGame = 50, confidence = 99.%, score = 33./50
nGame = 50, confidence = 99.9%, score = 35./50
nGame = 50, confidence = 99.9999%, score = 39./50

nGame = 75, confidence = 95.%, score = 45./75
nGame = 75, confidence = 99.%, score = 47./75
nGame = 75, confidence = 99.9%, score = 49.5/75
nGame = 75, confidence = 99.9999%, score = 55./75

nGame = 100, confidence = 95.%, score = 58.5/100
nGame = 100, confidence = 99.%, score = 61./100
nGame = 100, confidence = 99.9%, score = 64./100
nGame = 100, confidence = 99.9999%, score = 70./100

-----------------------------------------------------------------------
Null hypothesis is that (win,lose,draw) = (1/2,1/2,0),
i.e., no draws

nGame = 6, confidence = 95.%, score = 5.5/6

nGame = 7, confidence = 95.%, score = 6.5/7

nGame = 8, confidence = 95.%, score = 7.5/8
nGame = 8, confidence = 99.%, score = 7.5/8

nGame = 9, confidence = 95.%, score = 7.5/9
nGame = 9, confidence = 99.%, score = 8.5/9

nGame = 10, confidence = 95.%, score = 8.5/10
nGame = 10, confidence = 99.%, score = 9.5/10

nGame = 15, confidence = 95.%, score = 11.5/15
nGame = 15, confidence = 99.%, score = 12.5/15
nGame = 15, confidence = 99.9%, score = 13.5/15

nGame = 20, confidence = 95.%, score = 14.5/20
nGame = 20, confidence = 99.%, score = 16.5/20
nGame = 20, confidence = 99.9%, score = 17.5/20

nGame = 30, confidence = 95.%, score = 20.5/30
nGame = 30, confidence = 99.%, score = 22.5/30
nGame = 30, confidence = 99.9%, score = 24.5/30
nGame = 30, confidence = 99.9999%, score = 27.5/30

nGame = 40, confidence = 95.%, score = 26.5/40
nGame = 40, confidence = 99.%, score = 28.5/40
nGame = 40, confidence = 99.9%, score = 30.5/40
nGame = 40, confidence = 99.9999%, score = 35.5/40

nGame = 50, confidence = 95.%, score = 32.5/50
nGame = 50, confidence = 99.%, score = 34.5/50
nGame = 50, confidence = 99.9%, score = 36.5/50
nGame = 50, confidence = 99.9999%, score = 42.5/50

nGame = 75, confidence = 95.%, score = 46.5/75
nGame = 75, confidence = 99.%, score = 49.5/75
nGame = 75, confidence = 99.9%, score = 52.5/75
nGame = 75, confidence = 99.9999%, score = 58.5/75

nGame = 100, confidence = 95.%, score = 60.5/100
nGame = 100, confidence = 99.%, score = 63.5/100
nGame = 100, confidence = 99.9%, score = 66.5/100
nGame = 100, confidence = 99.9999%, score = 74.5/100







Last modified: Thu, 15 Apr 21 08:11:13 -0700
