Author: John Sidles
Date: 20:08:08 02/18/06
On February 18, 2006 at 18:50:28, Thomas Mayer wrote:

>Hi John,
>
>On February 18, 2006 at 17:28:15, John Sidles wrote:
>
>> nGame = 100, confidence = 95.%, score = 58.5/100
>> nGame = 100, confidence = 99.%, score = 61./100
>> nGame = 100, confidence = 99.9%, score = 64./100
>
>interesting. This looks way too low to me... You mean a 58.5 - 41.5 means that
>the winning engine is by 95% confidence stronger than the loser... I would have
>expected at least something like 80-20... How did you calculate that?
>
>Greets, Thomas

What this means is just what you say, namely, that if we assume that two programs are in fact of equal strength, with probability 1/3 of winning, losing, and drawing in each game, then in a 100 game match we can be 95% confident that the match score will be in the range 41.5--58.5.

So if the match score is *outside* this range, we are justified in suspecting that the programs are *not* of equal strength ... because there is only a 5% probability of seeing such a large mismatch between equal-strength programs.

The way the statisticians say this is awkward, but mathematically correct: we reject the null hypothesis (that the programs are of equal strength) with 95% confidence.

By the way, the confidence levels would be slightly different if the null hypothesis probabilities (win,lose,draw) were (1/4,1/4,1/2) instead of (1/3,1/3,1/3). Here, for completeness, I post several cases.

One take-home message is that a high probability of draws makes it harder to win a match by luck. Maybe this is why high-level players will accept a draw readily, rather than take any chance of losing.

How do I calculate these? Well, I just calculate every possible match score, assign it the appropriate probability, and add it up! The computer doesn't mind.
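The enumeration described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the post: the function names `score_distribution` and `critical_score` are hypothetical, and the two-sided convention (report the smallest half-point score whose distance from an even score is reached with probability at most 1 - confidence) is an assumption about how the thresholds were chosen.

```python
def score_distribution(n_games, p_win, p_loss, p_draw):
    """Return dist where dist[k] = P(match score == k/2) after n_games.

    Scores are tracked in half-points so that draws stay on an integer grid.
    """
    dist = [1.0]  # before any game is played the score is 0 with certainty
    for _ in range(n_games):
        new = [0.0] * (len(dist) + 2)
        for k, p in enumerate(dist):
            new[k] += p * p_loss       # a loss adds 0 half-points
            new[k + 1] += p * p_draw   # a draw adds 1 half-point
            new[k + 2] += p * p_win    # a win adds 2 half-points
        dist = new
    return dist


def critical_score(n_games, p_win, p_loss, p_draw, confidence):
    """Smallest score whose two-sided tail probability is <= 1 - confidence.

    Assumed convention: the tail counts every outcome at least as far from
    the even score n_games/2 as the candidate, on either side.
    Returns None when even a perfect score is not rare enough.
    """
    dist = score_distribution(n_games, p_win, p_loss, p_draw)
    alpha = 1.0 - confidence
    for d in range(n_games + 1):  # d = distance from an even score, in half-points
        tail = sum(p for k, p in enumerate(dist) if abs(k - n_games) >= d)
        if tail <= alpha:
            return (n_games + d) / 2
    return None


if __name__ == "__main__":
    for conf in (0.95, 0.99, 0.999):
        print(100, conf, critical_score(100, 1/3, 1/3, 1/3, conf))
```

Under the assumed convention, the 100-game call at 95% with (1/3,1/3,1/3) should give the 58.5 quoted at the top of this post. Match lengths for which the function returns None correspond to entries that would simply not appear in a listing.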
-----------------------------------------------------------------------
Null hypothesis is that (win,lose,draw) = (1/4,1/4,1/2), i.e., many draws

nGame = 3, confidence = 95.%, score = 3./3
nGame = 4, confidence = 95.%, score = 4./4
nGame = 4, confidence = 99.%, score = 4./4
nGame = 5, confidence = 95.%, score = 4.5/5
nGame = 5, confidence = 99.%, score = 5./5
nGame = 6, confidence = 95.%, score = 5./6
nGame = 6, confidence = 99.%, score = 5.5/6
nGame = 6, confidence = 99.9%, score = 6./6
nGame = 7, confidence = 95.%, score = 6./7
nGame = 7, confidence = 99.%, score = 6.5/7
nGame = 7, confidence = 99.9%, score = 7./7
nGame = 8, confidence = 95.%, score = 6.5/8
nGame = 8, confidence = 99.%, score = 7./8
nGame = 8, confidence = 99.9%, score = 7.5/8
nGame = 9, confidence = 95.%, score = 7./9
nGame = 9, confidence = 99.%, score = 7.5/9
nGame = 9, confidence = 99.9%, score = 8.5/9
nGame = 10, confidence = 95.%, score = 7.5/10
nGame = 10, confidence = 99.%, score = 8.5/10
nGame = 10, confidence = 99.9%, score = 9./10
nGame = 15, confidence = 95.%, score = 10.5/15
nGame = 15, confidence = 99.%, score = 11.5/15
nGame = 15, confidence = 99.9%, score = 12.5/15
nGame = 15, confidence = 99.9999%, score = 14./15
nGame = 20, confidence = 95.%, score = 13.5/20
nGame = 20, confidence = 99.%, score = 14.5/20
nGame = 20, confidence = 99.9%, score = 15.5/20
nGame = 20, confidence = 99.9999%, score = 18./20
nGame = 30, confidence = 95.%, score = 19.5/30
nGame = 30, confidence = 99.%, score = 20.5/30
nGame = 30, confidence = 99.9%, score = 22./30
nGame = 30, confidence = 99.9999%, score = 24.5/30
nGame = 40, confidence = 95.%, score = 25./40
nGame = 40, confidence = 99.%, score = 26./40
nGame = 40, confidence = 99.9%, score = 28./40
nGame = 40, confidence = 99.9999%, score = 31./40
nGame = 50, confidence = 95.%, score = 30.5/50
nGame = 50, confidence = 99.%, score = 32./50
nGame = 50, confidence = 99.9%, score = 33.5/50
nGame = 50, confidence = 99.9999%, score = 37.5/50
nGame = 75, confidence = 95.%, score = 44./75
nGame = 75, confidence = 99.%, score = 46./75
nGame = 75, confidence = 99.9%, score = 48./75
nGame = 75, confidence = 99.9999%, score = 53./75
nGame = 100, confidence = 95.%, score = 57.5/100
nGame = 100, confidence = 99.%, score = 59.5/100
nGame = 100, confidence = 99.9%, score = 62./100
nGame = 100, confidence = 99.9999%, score = 67.5/100

-----------------------------------------------------------------------
Null hypothesis is that (win,lose,draw) = (1/3,1/3,1/3), i.e., moderate draws

nGame = 4, confidence = 95.%, score = 4./4
nGame = 5, confidence = 95.%, score = 4.5/5
nGame = 5, confidence = 99.%, score = 5./5
nGame = 6, confidence = 95.%, score = 5.5/6
nGame = 6, confidence = 99.%, score = 6./6
nGame = 7, confidence = 95.%, score = 6./7
nGame = 7, confidence = 99.%, score = 6.5/7
nGame = 7, confidence = 99.9%, score = 7./7
nGame = 8, confidence = 95.%, score = 6.5/8
nGame = 8, confidence = 99.%, score = 7.5/8
nGame = 8, confidence = 99.9%, score = 8./8
nGame = 9, confidence = 95.%, score = 7.5/9
nGame = 9, confidence = 99.%, score = 8./9
nGame = 9, confidence = 99.9%, score = 9./9
nGame = 10, confidence = 95.%, score = 8./10
nGame = 10, confidence = 99.%, score = 8.5/10
nGame = 10, confidence = 99.9%, score = 9.5/10
nGame = 15, confidence = 95.%, score = 11./15
nGame = 15, confidence = 99.%, score = 12./15
nGame = 15, confidence = 99.9%, score = 13./15
nGame = 15, confidence = 99.9999%, score = 15./15
nGame = 20, confidence = 95.%, score = 14./20
nGame = 20, confidence = 99.%, score = 15./20
nGame = 20, confidence = 99.9%, score = 16.5/20
nGame = 20, confidence = 99.9999%, score = 19./20
nGame = 30, confidence = 95.%, score = 20./30
nGame = 30, confidence = 99.%, score = 21./30
nGame = 30, confidence = 99.9%, score = 22.5/30
nGame = 30, confidence = 99.9999%, score = 26./30
nGame = 40, confidence = 95.%, score = 25.5/40
nGame = 40, confidence = 99.%, score = 27./40
nGame = 40, confidence = 99.9%, score = 29./40
nGame = 40, confidence = 99.9999%, score = 32.5/40
nGame = 50, confidence = 95.%, score = 31./50
nGame = 50, confidence = 99.%, score = 33./50
nGame = 50, confidence = 99.9%, score = 35./50
nGame = 50, confidence = 99.9999%, score = 39./50
nGame = 75, confidence = 95.%, score = 45./75
nGame = 75, confidence = 99.%, score = 47./75
nGame = 75, confidence = 99.9%, score = 49.5/75
nGame = 75, confidence = 99.9999%, score = 55./75
nGame = 100, confidence = 95.%, score = 58.5/100
nGame = 100, confidence = 99.%, score = 61./100
nGame = 100, confidence = 99.9%, score = 64./100
nGame = 100, confidence = 99.9999%, score = 70./100

-----------------------------------------------------------------------
Null hypothesis is that (win,lose,draw) = (1/2,1/2,0), i.e., no draws

nGame = 6, confidence = 95.%, score = 5.5/6
nGame = 7, confidence = 95.%, score = 6.5/7
nGame = 8, confidence = 95.%, score = 7.5/8
nGame = 8, confidence = 99.%, score = 7.5/8
nGame = 9, confidence = 95.%, score = 7.5/9
nGame = 9, confidence = 99.%, score = 8.5/9
nGame = 10, confidence = 95.%, score = 8.5/10
nGame = 10, confidence = 99.%, score = 9.5/10
nGame = 15, confidence = 95.%, score = 11.5/15
nGame = 15, confidence = 99.%, score = 12.5/15
nGame = 15, confidence = 99.9%, score = 13.5/15
nGame = 20, confidence = 95.%, score = 14.5/20
nGame = 20, confidence = 99.%, score = 16.5/20
nGame = 20, confidence = 99.9%, score = 17.5/20
nGame = 30, confidence = 95.%, score = 20.5/30
nGame = 30, confidence = 99.%, score = 22.5/30
nGame = 30, confidence = 99.9%, score = 24.5/30
nGame = 30, confidence = 99.9999%, score = 27.5/30
nGame = 40, confidence = 95.%, score = 26.5/40
nGame = 40, confidence = 99.%, score = 28.5/40
nGame = 40, confidence = 99.9%, score = 30.5/40
nGame = 40, confidence = 99.9999%, score = 35.5/40
nGame = 50, confidence = 95.%, score = 32.5/50
nGame = 50, confidence = 99.%, score = 34.5/50
nGame = 50, confidence = 99.9%, score = 36.5/50
nGame = 50, confidence = 99.9999%, score = 42.5/50
nGame = 75, confidence = 95.%, score = 46.5/75
nGame = 75, confidence = 99.%, score = 49.5/75
nGame = 75, confidence = 99.9%, score = 52.5/75
nGame = 75, confidence = 99.9999%, score = 58.5/75
nGame = 100, confidence = 95.%, score = 60.5/100
nGame = 100, confidence = 99.%, score = 63.5/100
nGame = 100, confidence = 99.9%, score = 66.5/100
nGame = 100, confidence = 99.9999%, score = 74.5/100
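
A quick way to see why the thresholds rise from the many-draws case to the no-draws case is to look at the spread of the match score. The sketch below assumes equal win and loss probabilities, as in all three null hypotheses; the helper name `score_std` is hypothetical. Under that assumption the per-game score (1, 1/2, or 0) has variance (1 - p_draw)/4, so the standard deviation of an n-game match score is sqrt(n * (1 - p_draw)) / 2.

```python
import math


def score_std(n_games, p_draw):
    """Standard deviation of the match score for equal-strength programs.

    Assumes P(win) = P(loss) = (1 - p_draw) / 2 and independent games.
    Per-game variance: a decisive game lies 1/2 away from the mean score of
    1/2 and a draw lies exactly on it, giving (1 - p_draw) * (1/2)**2.
    """
    return math.sqrt(n_games * (1.0 - p_draw)) / 2.0


if __name__ == "__main__":
    # The three draw rates used in the null hypotheses above.
    for p_draw in (0.5, 1 / 3, 0.0):
        print(f"p_draw = {p_draw:.3f}: std of 100-game score = {score_std(100, p_draw):.2f}")
```

The more draws there are, the narrower the distribution of scores between equal programs, so a given lead is less likely to arise by luck.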
Last modified: Thu, 15 Apr 21 08:11:13 -0700