Computer Chess Club Archives


Subject: Re: Some stats...

Author: Dann Corbit

Date: 19:09:00 01/23/04



On January 23, 2004 at 20:58:51, Bob Durrett wrote:

>On January 23, 2004 at 20:37:02, Christophe Theron wrote:
>
>>On January 23, 2004 at 14:31:59, Bob Durrett wrote:
>>
>>>On January 23, 2004 at 14:20:43, Christophe Theron wrote:
>>>
>>>>On January 23, 2004 at 07:08:07, Kolss wrote:
>>>>
>>>>>On January 22, 2004 at 12:53:16, Christophe Theron wrote:
>>>>>
>>>>>>On January 21, 2004 at 20:00:12, Kolss wrote:
>>>>>>
>>>>>>>Hi,
>>>>>>>
>>>>>>>How many games you need depends on what you want to show, of course... :-)
>>>>>>>If my calculations are correct, I get the following:
>>>>>>>
>>>>>>>Shredder 8 vs. Shredder 7.04:
>>>>>>>
>>>>>>>+90 -65 =145
>>>>>>>
>>>>>>>=> 162.5 - 137.5
>>>>>>>
>>>>>>>=> 54.17 %
>>>>>>>
>>>>>>>=>
>>>>>>>Elo difference = +29
>>>>>>>95 % confidence interval: [+1, +58]
>>>>>>>
>>>>>>>That means that based on this 300-game match (for this particular time control
>>>>>>>on this particular computer with these particular settings etc.), your best
>>>>>>>guess is that S8 is 29 Elo points better than S7.04 (highest likelihood for that
>>>>>>>value); there is a 95 % chance that S8 is between 1 and 58 Elo points better;
>>>>>>>and the likelihood that S8 is (at least 1 Elo point) better than S7.04 is 97.5
>>>>>>>%.
>>>>>>>
>>>>>>>So if you "only" want to show that S8 is better, you can - statistically
>>>>>>>speaking - stop now. If you want to "prove" that it is more than 20 Elo points
>>>>>>>better, you need a few more games indeed...
>>>>>>>
>>>>>>>Best regards - Munjong.
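The arithmetic in the post above can be checked with a short sketch. The post does not say how the interval was computed, so the method here is an assumption: a normal approximation on the per-game score (win = 1, draw = 0.5, loss = 0), with the interval endpoints converted through the standard logistic Elo formula. The function names `elo_from_score` and `match_stats` are made up for illustration.

```python
import math


def elo_from_score(p):
    """Convert an expected score p (0 < p < 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def match_stats(wins, losses, draws, z=1.96):
    """Return (score fraction, Elo estimate, Elo low, Elo high).

    Assumption: games are independent and the mean score is roughly
    normal, so z = 1.96 gives an approximate 95 % interval.
    """
    n = wins + losses + draws
    p = (wins + 0.5 * draws) / n
    # Per-game variance of the score around its mean p.
    var = (wins * (1.0 - p) ** 2
           + losses * (0.0 - p) ** 2
           + draws * (0.5 - p) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return p, elo_from_score(p), elo_from_score(p - z * se), elo_from_score(p + z * se)


# Shredder 8 vs. Shredder 7.04: +90 -65 =145
p, elo, lo, hi = match_stats(90, 65, 145)
print(f"score {100 * p:.2f} %, Elo {elo:+.0f}, 95 % interval [{lo:+.0f}, {hi:+.0f}]")
```

Under these assumptions the sketch reproduces the quoted figures to rounding (54.17 %, +29, [+1, +58]). Counting draws in the variance matters: with 145 draws in 300 games the interval is noticeably narrower than a plain win/loss binomial would give.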
>>>>>>
>>>>>>
>>>>>>
>>>>>>It's great to see that at least one guy is able to correctly interpret match
>>>>>>results here.
>>>>>>
>>>>>>I hope you will post more often on this subject. Information on it is very much
>>>>>>needed here.
>>>>>
>>>>>Well, as my former English teacher used to say:
>>>>>
>>>>>"I'm talking to the trees - but they aren't listening to me..." :-)
>>>>>
>>>>>I guess some people just don't bother trying to consult a *basic* statistics
>>>>>book before jumping on you... ;-)
>>>>>
>>>>>Best regards - Munjong.
>>>>
>>>>
>>>>
>>>>Please don't leave the forum and help me educate people! :)
>>>>
>>>>Actually, people do not need to understand all the maths behind the stats (I
>>>>don't myself), just a few basics. For example, that a 10-game
>>>>match tells you almost nothing.
>>>>
>>>>
>>>>
>>>>    Christophe
>>>
>>>Imagine yourself playing a 10-game rated match against one of your peers
>>>[someone who sneers and blows smoke in your face], and suppose you lost all ten
>>>games.  You would then think the match meant a lot!  One step away from that
>>>is a match played between your chess program and someone
>>>else's.  Your program would be your "pride and joy" and would, in effect, be
>>>your surrogate.  I imagine that it would be hard to accept the idea that a
>>>ten-game loss would be insignificant.  It's great to be able to stand back and see
>>>things objectively, of course.  Generally, I feel that SOME information is
>>>provided by every tournament or match no matter how few games are played.  I
>>>agree in principle, however, that a 5 1/2 to 4 1/2 result in a ten-game match
>>>would offer little insight into the current playing strengths of the players.
>>>
>>>Bob D.
>>
>>
>>
>>Your last sentence is what I had in mind. 5.5-4.5, as we see so often, is not a
>>result that allows us to decide which program is better. Even 6.5-3.5 does not
>>allow it. And that's what we see all the time, even between programs that are
>>supposed to be of very different strength.
>>
>>So for all practical cases here, a 10-game match is not something I would
>>consider interesting.
>>
>>Of course it can be interesting to replay the games, but for different reasons.
>>
>>
>>
>>    Christophe
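To put a rough number on the point about 5.5-4.5 and 6.5-3.5, here is a minimal sketch that asks how often two programs of exactly equal strength would produce such a score by chance in a short match. Everything in it is an assumption made for illustration: games are treated as independent, each game is a draw with a fixed probability, and the default draw rate of 0.4 is a hypothetical value, not a figure from this thread. The function name `prob_score_at_least` is likewise made up.

```python
from math import comb


def prob_score_at_least(target, games=10, draw_prob=0.4, win_prob=None):
    """Probability that one given side scores at least `target` points.

    If win_prob is None, the two sides are assumed equally strong, so
    the non-draw probability is split evenly between win and loss.
    draw_prob = 0.4 is a hypothetical default, not measured data.
    """
    if win_prob is None:
        win_prob = (1.0 - draw_prob) / 2.0
    loss_prob = 1.0 - draw_prob - win_prob
    total = 0.0
    # Enumerate every (wins, draws, losses) split of the match.
    for wins in range(games + 1):
        for draws in range(games - wins + 1):
            losses = games - wins - draws
            if wins + 0.5 * draws >= target:
                ways = comb(games, wins) * comb(games - wins, draws)
                total += (ways * win_prob ** wins
                          * draw_prob ** draws * loss_prob ** losses)
    return total


for target in (5.5, 6.5, 10.0):
    print(f"P(score >= {target} of 10 | equal strength) = "
          f"{prob_score_at_least(target):.4f}")
```

If the printed probability for a given score is not small, that score is easy to get between equal programs and so says little about which one is stronger. A 10-0 sweep is the opposite case, which is consistent with Bob's remark that some short results do carry information.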
>
>Yes, I see your point and I agree.
>
>For SMALL tournaments, exhaustive post-mortem analysis of the games may be the
>**only** way to obtain a significant amount of useful information from the
>tournament.

But they can be just as fun as the big, long-lasting ones.

Consider the WMCCC.  It proves nothing, but everyone here will be on pins and
needles while it is running (including me).

On the other hand, sometimes declaring a champion is just what is wanted.  Not
the same thing as finding out "who's best" of course, but still interesting.




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.