Author: Bob Durrett
Date: 17:58:51 01/23/04
On January 23, 2004 at 20:37:02, Christophe Theron wrote:

>On January 23, 2004 at 14:31:59, Bob Durrett wrote:
>
>>On January 23, 2004 at 14:20:43, Christophe Theron wrote:
>>
>>>On January 23, 2004 at 07:08:07, Kolss wrote:
>>>
>>>>On January 22, 2004 at 12:53:16, Christophe Theron wrote:
>>>>
>>>>>On January 21, 2004 at 20:00:12, Kolss wrote:
>>>>>
>>>>>>Hi,
>>>>>>
>>>>>>How many games you need depends on what you want to show, of course... :-)
>>>>>>If my calculations are correct, I get the following:
>>>>>>
>>>>>>Shredder 8 vs. Shredder 7.04:
>>>>>>
>>>>>>+90 -65 =145
>>>>>>
>>>>>>=> 162.5 - 137.5
>>>>>>
>>>>>>=> 54.17 %
>>>>>>
>>>>>>=>
>>>>>>Elo difference = +29
>>>>>>95 % confidence interval: [+1, +58]
>>>>>>
>>>>>>That means that based on this 300-game match (for this particular time control
>>>>>>on this particular computer with these particular settings etc.), your best
>>>>>>guess is that S8 is 29 Elo points better than S7.04 (highest likelihood for that
>>>>>>value); there is a 95 % chance that S8 is between 1 and 58 Elo points better;
>>>>>>and the likelihood that S8 is (at least 1 Elo point) better than S7.04 is 97.5
>>>>>>%.
>>>>>>
>>>>>>So if you "only" want to show that S8 is better, you can - statistically
>>>>>>speaking - stop now. If you want to "prove" that it is more than 20 Elo points
>>>>>>better, you need a few more games indeed...
>>>>>>
>>>>>>Best regards - Munjong.
>>>>>
>>>>>
>>>>>
>>>>>It's great to see that at least one guy is able to correctly interpret match
>>>>>results here.
>>>>>
>>>>>I hope you will post more often on this subject. Information on it is very much
>>>>>needed here.
>>>>
>>>>Well, as my former English teacher used to say:
>>>>
>>>>"I'm talking to the trees - but they aren't listening to me..." :-)
>>>>
>>>>I guess some people just don't bother trying to consult a *basic* statistics
>>>>book before jumping on you... ;-)
>>>>
>>>>Best regards - Munjong.
>>>
>>>
>>>
>>>Please don't leave the forum and help me educate people! :)
>>>
>>>Actually people do not need to understand all the maths behind the stats (I
>>>don't myself), but just to understand a few basics. For example that a 10 games
>>>match tells mostly nothing.
>>>
>>>
>>>
>>>    Christophe
>>
>>Imagine yourself playing a 10 game rated match against one of your peers
>>[someone who sneers and blows smoke in your face] and suppose you lost all ten
>>games? You would then think the match meant a lot! One step away from that
>>would be when the match were played between your chess program and someone
>>else's. Your program would be your "pride and joy" and would, in effect, be
>>your surrogate. I imagine that it would be hard to accept the idea that a ten
>>game loss would be insignificant. It's great to be able to stand back and see
>>things objectively, of course. Generally, I feel that SOME information is
>>provided by every tournament or match no matter how few games are played. I
>>agree in principle, however, that a 5 1/2 to 4 1/2 result in a ten game match
>>would offer little insight into the current playing strengths of the players.
>>
>>Bob D.
>
>
>
>Your last sentence is what I had in mind. 5.5-4.5 as we see so often is not a
>result that allow us to decide which program is better. Even 6.5-3.5 does not
>allow it. And that's what we see all the time, even between programs that are
>supposed to be of very different strength.
>
>So for all practical cases here, a 10 games match is not something I would
>consider interesting.
>
>Of course it can be interesting to replay the games, but for different reasons.
>
>
>
>    Christophe

Yes, I see your point and I agree. For SMALL tournaments, exhaustive post-mortem
analysis of the games may be the **only** way to obtain a significant amount of
useful information from the tournament.

Bob D.
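
For anyone who wants to check the Shredder 8 vs. Shredder 7.04 figures quoted above, here is a rough sketch of one way to get them. Kolss does not say how the interval was computed, so the method is an assumption: a normal approximation on the per-game score (win = 1, draw = 0.5, loss = 0) combined with the usual logistic Elo formula. The function names `score_to_elo` and `elo_interval` are made up for illustration.

```python
import math


def score_to_elo(p):
    """Convert an expected score p (between 0 and 1) to an Elo difference,
    using the standard logistic Elo model."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return 400.0 * math.log10(p / (1.0 - p))


def elo_interval(wins, losses, draws, z=1.96):
    """Return (elo, low, high) for a match result.

    Assumption: the score fraction is treated as approximately normal,
    which is reasonable for a few hundred games and crude for ten.
    z = 1.96 corresponds to a two-sided 95 % interval.
    """
    n = wins + losses + draws
    p = (wins + 0.5 * draws) / n               # mean score per game
    second_moment = (wins + 0.25 * draws) / n  # mean of squared scores
    variance = second_moment - p * p           # per-game score variance
    se = math.sqrt(variance / n)               # standard error of the mean
    return (score_to_elo(p),
            score_to_elo(p - z * se),
            score_to_elo(p + z * se))


# The 300-game result quoted above: +90 -65 =145
elo, low, high = elo_interval(90, 65, 145)
print(f"Elo difference {elo:+.1f}, 95 % interval [{low:+.1f}, {high:+.1f}]")
```

With +90 -65 =145 this should land close to the +29 and [+1, +58] quoted above. Feeding it a ten-game result (with whatever win/draw/loss split you care to assume) gives an interval far too wide to separate two programs, which is the point Christophe is making, although the normal approximation itself is shaky for so few games.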
Last modified: Thu, 15 Apr 21 08:11:13 -0700