Author: Rolf Tueschen
Date: 17:15:14 01/22/04
On January 22, 2004 at 12:53:16, Christophe Theron wrote:

>On January 21, 2004 at 20:00:12, Kolss wrote:
>
>>Hi,
>>
>>How many games you need depends on what you want to show, of course... :-)
>>If my calculations are correct, I get the following:
>>
>>Shredder 8 vs. Shredder 7.04:
>>
>>+90 -65 =145
>>
>>=> 162.5 - 137.5
>>
>>=> 54.17 %
>>
>>=>
>>Elo difference = +29
>>95 % confidence interval: [+1, +58]
>>
>>That means that based on this 300-game match (for this particular time control
>>on this particular computer with these particular settings etc.), your best
>>guess is that S8 is 29 Elo points better than S7.04 (highest likelihood for that
>>value); there is a 95 % chance that S8 is between 1 and 58 Elo points better;
>>and the likelihood that S8 is (at least 1 Elo point) better than S7.04 is 97.5
>>%.

This is wrong. Statistics doesn't work this way. In your example above, 1 Elo is
as probable as 58 Elo. There is no way to claim that Elo 29 is the "best" guess.
With a defined confidence interval of 95% you get a range of 1 to 58 Elo points.
Then you look at how the results differ for the two programs. All results
between 1 and 58 tell you nothing about differences! You still have to admit
that the two programs could be equally strong. You need at least Elo +-59 to
claim that one is better or worse. - NB: you assume that the two programs are
equally strong and then you test against that assumption. You must top 58.

[All this is on the basis of a specific number N of games, with the results
calculated in Elo. I didn't follow the debate, but normally you calculate with
the scores from the games/matches; I mention it just for completeness.]

Rolf

>>
>>So if you "only" want to show that S8 is better, you can - statistically
>>speaking - stop now. If you want to "prove" that it is more than 20 Elo points
>>better, you need a few more games indeed...
>>
>>Best regards - Munjong.
>
>
>
>It's great to see that at least one guy is able to correctly interpret match
>results here.
>
>I hope you will post more often on this subject. Information on it is very much
>needed here.
>
>
>
>    Christophe
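
For anyone who wants to check the arithmetic being argued about, here is a minimal Python sketch of how figures like these can be obtained from a win/loss/draw record. It is not taken from any post in this thread: the function names (elo_from_score, match_summary) are made up for illustration, and the method is an assumption, namely the usual logistic Elo formula combined with a normal approximation to the mean score per game. The posters may have calculated differently.

```python
import math


def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1),
    using the standard logistic Elo formula."""
    return 400.0 * math.log10(p / (1.0 - p))


def match_summary(wins, losses, draws, z=1.96):
    """Summarise a match from the first player's point of view.

    Assumption: the mean score per game is treated as approximately
    normal, so the interval is mean +/- z standard errors
    (z = 1.96 for a two-sided 95 % interval).
    """
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n

    # Per-game variance of the score (1, 0.5 or 0) around the mean.
    var = (wins * (1.0 - mean) ** 2
           + losses * (0.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score

    # One-sided probability that the true score exceeds 50 %,
    # from the normal CDF written with math.erf.
    p_better = 0.5 * (1.0 + math.erf((mean - 0.5) / (se * math.sqrt(2.0))))

    return {
        "score": mean,
        "elo": elo_from_score(mean),
        "elo_low": elo_from_score(mean - z * se),
        "elo_high": elo_from_score(mean + z * se),
        "p_better": p_better,
    }


if __name__ == "__main__":
    # Shredder 8 vs. Shredder 7.04: +90 -65 =145
    r = match_summary(90, 65, 145)
    print(f"score {100 * r['score']:.2f} %")
    print(f"Elo {r['elo']:+.0f}, 95 % interval "
          f"[{r['elo_low']:+.0f}, {r['elo_high']:+.0f}]")
    print(f"P(score > 50 %) = {r['p_better']:.3f}")
```

Under these assumptions the +90 -65 =145 record gives a score of 54.17 %, an Elo difference of +29 and an interval of [+1, +58], matching the quoted numbers. The sketch only reproduces the calculation; how the interval should be interpreted is the point under dispute above.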
Last modified: Thu, 15 Apr 21 08:11:13 -0700