Computer Chess Club Archives


Subject: Re: what type of result is significant in 100 game match

Author: John Sidles

Date: 14:28:15 02/18/06


Trickier than I thought!  The previous table could have
errors of up to one-half point.  Here's the final (I hope)
table, extending down to very short tournaments.

For programmers, the interesting thing is that a tournament
as short as four games can give 95% confidence that a "tweak"
has helped (or hurt) the program, *iff* the modified program
sweeps (or loses) all four games.

The shortest tournament that yields 99.9% confidence is
a seven-game "sweep" (or loss).

The shortest tournament in which a score that includes a draw
still reaches 99.9% confidence is ten games, with a 9.5/10 score.
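
The sweep claims above can be checked with a few lines of arithmetic.
This is only a sketch: it assumes the "luck" model from the quoted
message below (win, draw and loss each with probability 1/3) and a
two-sided test (either program may be the one that sweeps). The
two-sided reading is an assumption; the post does not state it
explicitly. The function name sweep_p_value is made up for this
illustration.

```python
from fractions import Fraction

def sweep_p_value(n_games):
    """Probability that either side wins all n_games purely by luck,
    assuming each game is a win, draw, or loss with probability 1/3."""
    one_side = Fraction(1, 3) ** n_games  # one program wins every game
    return 2 * one_side                   # either program may be the lucky one

for n in (3, 4, 6, 7):
    p = sweep_p_value(n)
    print(f"{n}-game sweep: p = {p} (about {float(p):.5f})")
```

Under these assumptions a four-game sweep has p = 2/81, which is below
0.05, while a three-game sweep (2/27) is not. A seven-game sweep has
p = 2/2187, which is below 0.001, while a six-game sweep (2/729) is not.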

Here's the extended table:

nGame = 4, confidence = 95.%, score = 4./4

nGame = 5, confidence = 95.%, score = 4.5/5
nGame = 5, confidence = 99.%, score = 5./5

nGame = 6, confidence = 95.%, score = 5.5/6
nGame = 6, confidence = 99.%, score = 6./6

nGame = 7, confidence = 95.%, score = 6./7
nGame = 7, confidence = 99.%, score = 6.5/7
nGame = 7, confidence = 99.9%, score = 7./7

nGame = 8, confidence = 95.%, score = 6.5/8
nGame = 8, confidence = 99.%, score = 7.5/8
nGame = 8, confidence = 99.9%, score = 8./8

nGame = 9, confidence = 95.%, score = 7.5/9
nGame = 9, confidence = 99.%, score = 8./9
nGame = 9, confidence = 99.9%, score = 9./9

nGame = 10, confidence = 95.%, score = 8./10
nGame = 10, confidence = 99.%, score = 8.5/10
nGame = 10, confidence = 99.9%, score = 9.5/10

nGame = 15, confidence = 95.%, score = 11./15
nGame = 15, confidence = 99.%, score = 12./15
nGame = 15, confidence = 99.9%, score = 13./15

nGame = 20, confidence = 95.%, score = 14./20
nGame = 20, confidence = 99.%, score = 15./20
nGame = 20, confidence = 99.9%, score = 16.5/20

nGame = 30, confidence = 95.%, score = 20./30
nGame = 30, confidence = 99.%, score = 21./30
nGame = 30, confidence = 99.9%, score = 22.5/30

nGame = 40, confidence = 95.%, score = 25.5/40
nGame = 40, confidence = 99.%, score = 27./40
nGame = 40, confidence = 99.9%, score = 29./40

nGame = 50, confidence = 95.%, score = 31./50
nGame = 50, confidence = 99.%, score = 33./50
nGame = 50, confidence = 99.9%, score = 35./50

nGame = 75, confidence = 95.%, score = 45./75
nGame = 75, confidence = 99.%, score = 47./75
nGame = 75, confidence = 99.9%, score = 49.5/75

nGame = 100, confidence = 95.%, score = 58.5/100
nGame = 100, confidence = 99.%, score = 61./100
nGame = 100, confidence = 99.9%, score = 64./100


On February 18, 2006 at 17:07:12, John Sidles wrote:

>On February 18, 2006 at 16:52:34, John Sidles wrote:
>
>Here's the same table, with total score instead of the
>(less well-defined) "plus score"
>
>nGame = 10, confidence = 95.%, score = 7.5/10
>nGame = 10, confidence = 99.%, score = 8./10
>nGame = 10, confidence = 99.9%, score = 9./10
>
>nGame = 20, confidence = 95.%, score = 13.5/20
>nGame = 20, confidence = 99.%, score = 14.5/20
>nGame = 20, confidence = 99.9%, score = 16./20
>
>nGame = 30, confidence = 95.%, score = 19.5/30
>nGame = 30, confidence = 99.%, score = 20.5/30
>nGame = 30, confidence = 99.9%, score = 22./30
>
>nGame = 40, confidence = 95.%, score = 25./40
>nGame = 40, confidence = 99.%, score = 26.5/40
>nGame = 40, confidence = 99.9%, score = 28.5/40
>
>nGame = 50, confidence = 95.%, score = 30.5/50
>nGame = 50, confidence = 99.%, score = 32.5/50
>nGame = 50, confidence = 99.9%, score = 34.5/50
>
>nGame = 75, confidence = 95.%, score = 44.5/75
>nGame = 75, confidence = 99.%, score = 46.5/75
>nGame = 75, confidence = 99.9%, score = 49./75
>
>nGame = 100, confidence = 95.%, score = 58./100
>nGame = 100, confidence = 99.%, score = 60.5/100
>nGame = 100, confidence = 99.9%, score = 63.5/100
>
>
>
>>On February 18, 2006 at 03:50:09, Uri Blass wrote:
>>
>>>My question is based on your experience what is the biggest result that A beat B
>>>in match of 100 games(Noomen match or match based on other positions like Albert
>>>Silver's positions) but still A is not better than B against other programs.
>>
>>Here's a table for how large a plus score you need to see, by either A or B,
>>for you to be confident (at the given level of confidence) that this plus
>>score was not due to luck.
>>
>>Here "luck" means that A and B actually each have a 1/3 chance of a win, a
>>loss, and a draw, but that either program was simply lucky enough to achieve
>>a plus score.
>>
>>nGame = 10, confidence = 95.%, score = +2.5
>>nGame = 10, confidence = 99.%, score = +3.
>>nGame = 10, confidence = 99.9%, score = +4.
>>nGame = 20, confidence = 95.%, score = +3.5
>>nGame = 20, confidence = 99.%, score = +4.5
>>nGame = 20, confidence = 99.9%, score = +6.
>>nGame = 30, confidence = 95.%, score = +4.5
>>nGame = 30, confidence = 99.%, score = +5.5
>>nGame = 30, confidence = 99.9%, score = +7.
>>nGame = 40, confidence = 95.%, score = +5.
>>nGame = 40, confidence = 99.%, score = +6.5
>>nGame = 40, confidence = 99.9%, score = +8.5
>>nGame = 50, confidence = 95.%, score = +5.5
>>nGame = 50, confidence = 99.%, score = +7.5
>>nGame = 50, confidence = 99.9%, score = +9.5
>>nGame = 75, confidence = 95.%, score = +7.
>>nGame = 75, confidence = 99.%, score = +9.
>>nGame = 75, confidence = 99.9%, score = +11.5
>>nGame = 100, confidence = 95.%, score = +8.
>>nGame = 100, confidence = 99.%, score = +10.5
>>nGame = 100, confidence = 99.9%, score = +13.5
>>
>>So for example, in a 10 game tournament, if either
>>program achieves a +4 score (or higher), you can be
>>99.9% confident that such a high score was *not*
>>due to luck.
>>
>>Note: these were calculated by brute force in
>>Mathematica.
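
For readers without Mathematica, the brute-force calculation can be
sketched in Python. This is not the original code, which was not
posted. It assumes the "luck" model quoted above (win, draw and loss
each with probability 1/3) and a two-sided test, meaning the chance
that either program reaches the given score or better must fall below
1 minus the confidence level. The function names score_counts and
min_significant_score are made up for this illustration.

```python
from fractions import Fraction

def score_counts(n_games):
    """counts[h] = number of win/draw/loss sequences worth h half-points."""
    counts = [1]
    for _ in range(n_games):
        nxt = [0] * (len(counts) + 2)
        for h, c in enumerate(counts):
            nxt[h] += c      # loss: 0 half-points
            nxt[h + 1] += c  # draw: 1 half-point
            nxt[h + 2] += c  # win:  2 half-points
        counts = nxt
    return counts

def min_significant_score(n_games, confidence):
    """Lowest total score whose two-sided tail probability is below
    1 - confidence, or None if even a sweep is not enough.
    Pass confidence as a string such as "0.95" so it stays exact."""
    counts = score_counts(n_games)
    total = 3 ** n_games
    alpha = 1 - Fraction(confidence)
    best = None
    tail = 0
    # Walk down from a sweep, accumulating the upper tail.
    for h in range(2 * n_games, n_games, -1):
        tail += counts[h]
        if Fraction(2 * tail, total) < alpha:  # doubled: either side may be lucky
            best = h
        else:
            break
    return None if best is None else best / 2

for n in (4, 5, 6, 7, 8):
    for conf in ("0.95", "0.99", "0.999"):
        print(n, conf, min_significant_score(n, conf))
```

Under these assumptions the sketch gives 6, 6.5 and 7 points for a
seven-game tournament at 95%, 99% and 99.9%, and returns None where
even a sweep falls short (for example four games at 99%). Exact
fractions are used so that no threshold is decided by floating-point
rounding.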



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.