Author: Christophe Theron
Date: 10:56:59 03/20/00
On March 20, 2000 at 06:46:37, Bertil Eklund wrote:
>On March 19, 2000 at 22:26:49, Christophe Theron wrote:
>
>>On March 19, 2000 at 15:41:30, Bertil Eklund wrote:
>>
>>>
>>>Hi!
>>>
>>>A very impressive result from Shredder4.
>>>
>>>IMO Shredder plays very well positionally and is excellent in the endgames.
>>>Nimzo is a bit stronger tactically.
>>>
>>>Shredder4 used all 4 Turbo-CDs.
>>>
>>>Bertil
>>
>>
>>Bertil,
>>
>>I am not sure this message is going to be well accepted. So let me first state
>>that I have the greatest respect for your work and the SSDF.
>>
>>Let me also state that I have a lot of respect for Nimzo, Shredder, and their
>>respective authors.
>Yes, they are all great programs.
>
>
>>However, I can only strongly disagree with your sentence "a very impressive
>>result from Shredder4".
>
>57.5% against a program known as one of the best at tournament time controls
>impressed me, at least. I am only talking about this 40-game match. Maybe it loses to
>Tiger in the next match, but that is another match.
Maybe Tiger loses; I actually do not know.
But 57.5% must be taken with a statistical grain of salt. From the statistical
data I have, and I'm open to discussion about this, in a 40-game match you can
expect the error margin to be +/- 8.0% if you want 80% confidence.
So 57.5% over 40 games means that you are 80% sure that the result is somewhere
between 49.5% and 65.5%. That is somewhere between a draw and a crush; it is hard to
say more.
If you want 95% confidence, the error margin is much higher, and in this case
interpreting the result is almost impossible (we would end up saying that the
result is between "Nimzo crushed Shredder" and "Shredder crushed Nimzo").
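For anyone who wants to check this kind of margin themselves, here is a rough sketch. It assumes a plain normal approximation with independent games, which is not necessarily how my own figures were obtained. The function name score_interval and the 17 wins / 12 draws / 11 losses breakdown are hypothetical: the breakdown is just one split that adds up to 23-17, not the actual game results.

```python
from math import sqrt
from statistics import NormalDist


def score_interval(wins, draws, losses, confidence):
    """Return (score, margin) for a match, using a normal approximation."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    variance = (wins * 1.0 + draws * 0.25) / n - score ** 2
    std_error = sqrt(variance / n)
    # Two-sided critical value, about 1.28 for 80% and 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return score, z * std_error


# Hypothetical win/draw/loss split that adds up to 23-17 over 40 games.
for confidence in (0.80, 0.95):
    score, margin = score_interval(17, 12, 11, confidence)
    print(f"{confidence:.0%}: {score:.1%} +/- {margin:.1%}")
```

With this assumed split the 80% margin comes out a little above 8 points and the 95% margin near 13 points, so the 95% interval includes 50%. More draws make the margin smaller and fewer draws make it larger, which is why the exact figure depends on the actual results.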
>As I said earlier, I was very impressed by Tiger's play against the competition, both the
>results and the play, except in the match against F5.32. Not only did Fritz win,
>but Fritz "played better", showed some ideas, and could have won much more
>convincingly if it had not, in several games, misjudged some exchanges down to the
>endgame (Uri's favourite thesis), and of course there was excellent defence from Tiger.
So Fritz played better then played worse... So in the end... :)
>>You have played a 40-game match. Under these conditions, and given the result
>>(23-17 in favor of Shredder), it is absolutely impossible to say with 95%
>>confidence that Shredder is stronger than Nimzo. It is not even possible to say
>>it with 80% confidence.
>
>This isn't the list with 300+ games, only a single event and a subjective (IMO)
>commentary.
Yes, I understand that.
>>So saying that it is "a very impressive result from Shredder4" is, to say the
>>least, very far-fetched. Unless you were assuming that Shredder was weak, but
>>you weren't, were you?
>I have seen some not too impressive results from Shredder4; at least my matches
>against Hiarcs7.32 and Nimzo are better than the results from the Cadaques
>tournament.
>>I know my remarks here could be interpreted as bad taste on my part. I just want,
>>as I have done several times in the past, to introduce a little more good
>>sense into the interpretation of results.
>
> Yes this is very good, thank you!
>
>>I have already seen people claiming that program X was better than program Y
>>because X won against Y by 6-4 in a 10-game match. This is pure nonsense, of
>>course (well, I say of course, but do people know why?).
>
>Some people only need one or two games to be sure of a program's strengths and
>weaknesses!
My post was written for these people actually. Not really for you, because I
know you are aware of these mathematical details. :)
>>Similarly, a 23-17 result is not significant (at least not significant enough to
>>qualify it as an "impressive" victory), unless you are willing to take a big
>>risk in your statement. That's exactly why the SSDF insists that program
>>rankings are published together with the intervals of confidence computed for
>>these rankings.
>
>My article is signed Bertil, not Bertil SSDF, and I also mentioned IMO.
>Therefore I took the chance to give some impressions from the match. I have gone
>through every game, no anomalies so far. Only two stops (82 games):
>Hiarcs stopped moving while still in book, and in one game S4 stopped after one move,
>strangely saved as a win for Shredder4, while Nimzo saved it as a draw (unexpected
>end).
OK.
>>This is not to say that your match is not significant. This is not to say that
>>Shredder is not stronger than Nimzo (I do not know actually, I don't even own
>>these programs). When this result is added to other matches played by
>>Shredder and Nimzo, we will get a much better picture (and much better
>>confidence) about the respective strength of these programs.
>>
>Yes the list is scheduled for March.
>>
>>I hope my remark will not be interpreted negatively. I think the topic of
>>confidence intervals on match results deserves much more attention from
>>computer chess enthusiasts, and I regret that it is not discussed more
>>often here, on CCC.
>
>You have already proved that you are very sportsmanlike and fair, when Junior6
>"took" over the lead partly because of a book error.
>All fair criticism is welcome!
Given the result of Fritz6 against Tiger, I'm currently considering giving up
my sportsmanlike attitude. :)
Christophe