Author: Dann Corbit
Date: 19:09:00 01/23/04
On January 23, 2004 at 20:58:51, Bob Durrett wrote:

>On January 23, 2004 at 20:37:02, Christophe Theron wrote:
>
>>On January 23, 2004 at 14:31:59, Bob Durrett wrote:
>>
>>>On January 23, 2004 at 14:20:43, Christophe Theron wrote:
>>>
>>>>On January 23, 2004 at 07:08:07, Kolss wrote:
>>>>
>>>>>On January 22, 2004 at 12:53:16, Christophe Theron wrote:
>>>>>
>>>>>>On January 21, 2004 at 20:00:12, Kolss wrote:
>>>>>>
>>>>>>>Hi,
>>>>>>>
>>>>>>>How many games you need depends on what you want to show, of course... :-)
>>>>>>>If my calculations are correct, I get the following:
>>>>>>>
>>>>>>>Shredder 8 vs. Shredder 7.04:
>>>>>>>
>>>>>>>+90 -65 =145
>>>>>>>
>>>>>>>=> 162.5 - 137.5
>>>>>>>
>>>>>>>=> 54.17 %
>>>>>>>
>>>>>>>=>
>>>>>>>Elo difference = +29
>>>>>>>95 % confidence interval: [+1, +58]
>>>>>>>
>>>>>>>That means that based on this 300-game match (for this particular time control
>>>>>>>on this particular computer with these particular settings etc.), your best
>>>>>>>guess is that S8 is 29 Elo points better than S7.04 (highest likelihood for that
>>>>>>>value); there is a 95 % chance that S8 is between 1 and 58 Elo points better;
>>>>>>>and the likelihood that S8 is (at least 1 Elo point) better than S7.04 is 97.5
>>>>>>>%.
>>>>>>>
>>>>>>>So if you "only" want to show that S8 is better, you can - statistically
>>>>>>>speaking - stop now. If you want to "prove" that it is more than 20 Elo points
>>>>>>>better, you need a few more games indeed...
>>>>>>>
>>>>>>>Best regards - Munjong.
>>>>>>
>>>>>>
>>>>>>
>>>>>>It's great to see that at least one guy is able to correctly interpret match
>>>>>>results here.
>>>>>>
>>>>>>I hope you will post more often on this subject. Information on it is very much
>>>>>>needed here.
>>>>>
>>>>>Well, as my former English teacher used to say:
>>>>>
>>>>>"I'm talking to the trees - but they aren't listening to me..." :-)
>>>>>
>>>>>I guess some people just don't bother trying to consult a *basic* statistics
>>>>>book before jumping on you... ;-)
>>>>>
>>>>>Best regards - Munjong.
>>>>
>>>>
>>>>
>>>>Please don't leave the forum and help me educate people! :)
>>>>
>>>>Actually people do not need to understand all the maths behind the stats (I
>>>>don't myself), but just to understand a few basics. For example that a 10 games
>>>>match tells mostly nothing.
>>>>
>>>>
>>>>
>>>>    Christophe
>>>
>>>Imagine yourself playing a 10 game rated match against one of your peers
>>>[someone who sneers and blows smoke in your face] and suppose you lost all ten
>>>games? You would then think the match meant a lot! One step away from that
>>>would be when the match were played between your chess program and someone
>>>else's. Your program would be your "pride and joy" and would, in effect, be
>>>your surrogate. I imagine that it would be hard to accept the idea that a ten
>>>game loss would be insignificant. It's great to be able to stand back and see
>>>things objectively, of course. Generally, I feel that SOME information is
>>>provided by every tournament or match no matter how few games are played. I
>>>agree in principle, however, that a 5 1/2 to 4 1/2 result in a ten game match
>>>would offer little insight into the current playing strengths of the players.
>>>
>>>Bob D.
>>
>>
>>
>>Your last sentence is what I had in mind. 5.5-4.5 as we see so often is not a
>>result that allow us to decide which program is better. Even 6.5-3.5 does not
>>allow it. And that's what we see all the time, even between programs that are
>>supposed to be of very different strength.
>>
>>So for all practical cases here, a 10 games match is not something I would
>>consider interesting.
>>
>>Of course it can be interesting to replay the games, but for different reasons.
>>
>>
>>
>>    Christophe
>
>Yes, I see your point and I agree.
>
>For SMALL tournaments, exhaustive post-mortem analysis of the games may be the
>**only** way to obtain a significant amount of useful information from the
>tournament.

But they can be just as fun as the big, long-lasting ones. Consider the WMCCC. It proves nothing, but everyone here will be on pins and needles while it is running (including me).

On the other hand, sometimes declaring a champion is just what is wanted. Not the same thing as finding out "who's best" of course, but still interesting.
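
For anyone who wants to check the numbers quoted earlier in the thread (+90 -65 =145 giving +29 Elo with a 95 % interval of about [+1, +58]), here is a rough sketch in Python. The thread does not say how the interval was computed, so this is an assumption: it uses the usual logistic Elo formula and a normal approximation with the per-game variance of the score (1, 0.5 or 0 points). The function names are made up for illustration, and the 10-game win/draw/loss split is hypothetical, since the thread only gives the 6.5-3.5 total.

```python
import math

def elo_from_score(p):
    """Convert a score fraction p (0 < p < 1) to an Elo difference
    using the standard logistic Elo formula."""
    return 400.0 * math.log10(p / (1.0 - p))

def match_elo_interval(wins, losses, draws, z=1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    Assumption: normal approximation to the mean score, with the
    per-game variance estimated from the win/draw/loss counts.
    Breaks down for tiny or lopsided matches where the score bounds
    leave the open interval (0, 1), e.g. a 10-0 sweep.
    """
    n = wins + losses + draws
    p = (wins + 0.5 * draws) / n                    # mean score per game
    var = (wins * 1.0 + draws * 0.25) / n - p * p   # E[x^2] - E[x]^2
    se = math.sqrt(var / n)                         # standard error of the mean
    return (elo_from_score(p),
            elo_from_score(p - z * se),
            elo_from_score(p + z * se))

# The 300-game Shredder 8 vs. Shredder 7.04 result from the thread.
elo, low, high = match_elo_interval(90, 65, 145)
print(f"300 games: {elo:+.0f} [{low:+.0f}, {high:+.0f}]")

# A 6.5-3.5 ten-game match; the +5 -2 =3 split is a made-up example.
elo10, low10, high10 = match_elo_interval(5, 2, 3)
print(f" 10 games: {elo10:+.0f} [{low10:+.0f}, {high10:+.0f}]")
```

Under these assumptions the 300-game line comes out at about +29 with an interval of roughly +1 to +58, matching the quoted figures, while the interval for the ten-game example stretches from well below zero to far above it. That is the point made above about 6.5-3.5 not deciding anything.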
Last modified: Thu, 15 Apr 21 08:11:13 -0700