Computer Chess Club Archives



Subject: Re: Dangers in CC - SSDF: Terminology, Statistics

Author: Richard Pijl

Date: 06:27:57 02/21/03




>
>So, for the actual SSDF list, the three first programs could each be number one.

In fact, it is the first 10 as you have to apply both error bars :-)

And from the SSDF rating list:

'The Swedish Ratinglist may be quoted in other magazines, but we insist that
this will be done in a correct way! We expect, that not only the rating figures,
but also the number of games and the margin of error will be quoted.'
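To make the point about applying both error bars concrete, here is a rough sketch in Python. The program names, ratings and margins are made up for illustration and are not taken from the SSDF list, and the function could_be_first is a hypothetical helper. It uses a crude overlap check (a program stays a candidate for first place if its upper bound reaches the leader's lower bound), which is not a formal significance test.

```python
def could_be_first(entries):
    """Return names whose rating interval overlaps the leader's interval.

    entries: list of (name, rating, plus, minus) tuples, where plus/minus
    are the quoted margins of error (hypothetical values in this sketch).
    """
    # The leader is simply the entry with the highest point rating.
    leader = max(entries, key=lambda e: e[1])
    leader_low = leader[1] - leader[3]  # leader's rating minus its error bar
    # Apply the other error bar too: compare each upper bound to leader_low.
    return [name for name, rating, plus, minus in entries
            if rating + plus >= leader_low]


# Made-up numbers, for illustration only.
ratings = [
    ("P1", 2700, 30, 30),
    ("P2", 2680, 35, 35),
    ("P3", 2650, 25, 25),
    ("P4", 2600, 20, 20),
]
candidates = could_be_first(ratings)
print(candidates)
```

With these invented numbers, P3 is 50 points behind P1 on paper but still cannot be ruled out as number one, because both margins have to be taken into account.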

It is questionable though if you can capture 'the best chess program' on any
type of one-dimensional rating list. Imo it would only be possible to call one
program the best if that program consistently beat _all_ others in long
matches.

Consider one program (a) that beats all others except one (b) in long matches.
Program (b) exploits a weakness in program (a) but is in itself not a very
strong program, as it loses all its matches against the other programs.

Now we change the playing field: instead of just program (a), program (b) and a
number of others, the field now also contains a large number of programs similar
to program (b), meaning that they exploit the same weakness in program (a).
Obviously, program (a) loses many more matches than before and may no longer top
the list of 'best' programs.
Although this example is quite artificial, it does show that the choice of
opponents and the competition format have a large influence on the outcome of
the competition.
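
The thought experiment can be played through with a small Python sketch. Everything in it is an assumption made for illustration: the function name match_points, the scoring (1 point per match won, 0.5 per drawn match), and the rule that programs of the same kind draw their matches against each other. Program (a) beats the ordinary programs (c), the (b)-like programs beat (a), and the ordinary programs beat the (b)-like ones.

```python
def match_points(n_exploiters, n_others):
    """Round-robin of long matches; returns (name, points) sorted best first.

    Field: one program 'a', n_exploiters programs 'b1..' that beat 'a',
    and n_others ordinary programs 'c1..'. All results are fixed by kind.
    """
    names = (["a"]
             + [f"b{i}" for i in range(1, n_exploiters + 1)]
             + [f"c{i}" for i in range(1, n_others + 1)])
    # (winner kind, loser kind) pairs taken from the thought experiment.
    beats = {("a", "c"), ("b", "a"), ("c", "b")}

    def result(x, y):
        kx, ky = x[0], y[0]  # first letter is the kind of program
        if kx == ky:
            return 0.5       # assumption: same-kind programs draw
        return 1.0 if (kx, ky) in beats else 0.0

    points = {x: sum(result(x, y) for y in names if y != x) for x in names}
    return sorted(points.items(), key=lambda kv: -kv[1])


small_field = match_points(n_exploiters=1, n_others=5)
large_field = match_points(n_exploiters=10, n_others=5)
print(small_field[0], large_field[0], dict(large_field)["a"])
```

With one (b) and five others, (a) tops the table with 5 match points. With ten (b)-like programs in the field, (a) still scores only 5 while each ordinary program scores 12, so the same program with the same playing strength drops well down the list.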

Conclusion: A rating list only presents an indication of playing strength. You
cannot conclude that a single program/player is the best based on statistics. It
does provide an _indication_ of playing strength though ...

Richard





Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.