Computer Chess Club Archives

Subject: Re: Judge yourself!

Author: Ron Langeveld
Date: 15:46:15 07/31/02
On July 31, 2002 at 17:34:43, Maurizio De Leo wrote:

>On July 31, 2002 at 16:34:47, Gian-Carlo Pascutto wrote:
>
>>>Actually also #7 Junior is in the "confidence range".
>>
>>It isn't. (You can't simply add the error margins)
>>
>>sqrt(30^2 + 30^2) ≈ 42
>>
>>Fritz 7 is better than Junior 7 with >95% confidence.
>
>You are right.
>
>Assuming that the SSDF ranges are based on a normal distribution:
>
>Fritz
>mean                          M1 = 2741
>standard deviation            s1 = 15.306
>
>Junior
>mean                          M2 = 2689
>standard deviation            s2 = 14.796
>
>so Z = (M1 - M2) / sqrt(s1^2 + s2^2) = 52 / 21.29 = 2.44
>
>and since Z is also standard normal, this indeed leaves well under 5%
>probability that Junior has the same strength as Fritz.
>
>Maurizio
>
>P.S.   Thank you for letting me shake a little rust off my math.
>P.S.2  So after all the SSDF list isn't so useless: it rules out two big
>pretenders (Junior and Hiarcs) to the throne of best computer program.
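
For anyone who wants to check the quoted arithmetic, here is a minimal sketch
in Python. It assumes, as the quoted posts do, that the two rating errors are
independent and normally distributed. The function names are made up for
illustration, and the two-sided p-value is my own choice of test; the quoted
post only states that the probability is well under 5%.

```python
import math

def combined_margin(e1, e2):
    # Independent error margins add in quadrature, not linearly.
    return math.sqrt(e1**2 + e2**2)

def z_score(m1, s1, m2, s2):
    # Rating difference divided by the standard deviation of the difference.
    return (m1 - m2) / math.sqrt(s1**2 + s2**2)

def two_sided_p(z):
    # P(|Z| >= z) for a standard normal variable, via the complementary
    # error function: erfc(z / sqrt(2)) = 2 * (1 - Phi(z)).
    return math.erfc(abs(z) / math.sqrt(2))

# Figures taken from the quoted posts.
margin = combined_margin(30, 30)             # about 42
z = z_score(2741, 15.306, 2689, 14.796)      # about 2.44
p = two_sided_p(z)

print(f"combined margin = {margin:.1f}")
print(f"Z = {z:.2f}, two-sided p = {p:.4f}")
```

The 52-point gap exceeds the combined margin of about 42, and the p-value
comes out below 0.05, which agrees with both quoted calculations.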

An interesting thought! Without a doubt Fritz7 is a very strong engine. I use it
a lot. Compared to others it has a reliable evaluation, except for a few endgame
positions with opposite-colored bishops. From a purely statistical point of view
Fritz7 is very interesting. The results speak for themselves; however, I am
somehow pleased by the fact that some engines are not able to show these results
as well.

Next to statistics there is the aspect of understanding crucial positions. Where
evaluations of the same position drift apart, things become really interesting.
From my observations I can only conclude that these positions often prove that
engines like Diep, Shredder and Hiarcs are more capable of smelling strategic
errors by the opponent. Not that I want to diminish the importance of tactics:
every top program deserves full credit in this respect, and some people claim
tactics determine 90+% of the outcome. But even if Fritz is the uncrowned king
in this respect, these other engines surprise me with a dead-on evaluation in
difficult positions on more occasions than Fritz does.

You may call this unpredictability if you like, but there are other aspects to
consider besides statistics when you want to determine the best computer
program. These aspects may seem less objective, but as far as objective
measurement goes, I don't hear a lot of people complain that engines use books
of different quality. Sometimes I wish there were a separate SSDF list compiled
not from games but from a big collection of test positions. If these positions
reflect an appropriate amount of positional bonuses, then Fritz will no longer
top the ranks; that's my conviction.