Computer Chess Club Archives


Subject: Re: Dangers in CC - SSDF: Terminology, Statistics

Author: Rolf Tueschen

Date: 12:38:05 02/21/03



On February 21, 2003 at 09:27:57, Richard Pijl wrote:

>
>>
>>So, for the actual SSDF list, the three first programs could each be number one.
>
>In fact, it is the first 10 as you have to apply both error bars :-)
>
>And from the SSDF rating list:
>
>'The Swedish Ratinglist may be quoted in other magazines, but we insist that
>this will be done in a correct way! We expect, that not only the rating figures,
>but also the number of games and the margin of error will be quoted.'
>
>It is questionable though if you can capture 'the best chess program' on any
>type of one-dimensional rating list. Imo it would only be possible to call one
>program the best if that program consistently beats _all_ others in long
>matches.
>
>Consider one program (a) that beats all others except one (b) in long matches.
>Program (b) exploits a weakness in program (a) but is in itself not a very
>strong program, as it loses all its matches against the other programs.
>
>Now we change the playing field: Instead of program (a), program (b) and a
>number of others, we add a large number of programs similar to program (b),
>meaning that they exploit the same weakness in program (a). Obviously, program
>(a) loses many more matches than before and may no longer top the list of 'best'
>programs.
>Although this example is quite artificial, it does show that the choice of
>opponents and the competition format have a large influence on the outcome of
>the competition.
>
>Conclusion: A rating list only presents an indication of playing strength. You
>cannot conclude that a single program/player is the best based on statistics. It
>does provide an _indication_ of playing strength though ...
>
>Richard
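
Richard's thought experiment above can be sketched numerically. This is only a toy model: the three program types ("a", "b" for the exploiters, "o" for the other programs) follow his example, but the expected scores in the table are invented for illustration and are not taken from any real rating list.

```python
# Hypothetical expected score of the first type against the second type.
# All numbers are invented; only the pattern follows the example:
# (a) beats the others, (b) beats (a), the others beat (b).
EXPECTED = {
    ("a", "o"): 0.75, ("a", "b"): 0.25,
    ("o", "a"): 0.25, ("o", "o"): 0.50, ("o", "b"): 0.75,
    ("b", "a"): 0.75, ("b", "o"): 0.25, ("b", "b"): 0.50,
}


def average_scores(n_exploiters, n_others):
    """Average expected score per program type in a round robin.

    The pool holds one program (a), n_exploiters programs like (b)
    and n_others ordinary programs.
    """
    pool = ["a"] + ["b"] * n_exploiters + ["o"] * n_others
    result = {}
    for kind in ("a", "b", "o"):
        if kind not in pool:
            continue
        opponents = list(pool)
        opponents.remove(kind)  # a program does not play itself
        total = sum(EXPECTED[(kind, opp)] for opp in opponents)
        result[kind] = total / len(opponents)
    return result


def top_type(scores):
    """Return the program type with the highest average score."""
    return max(scores, key=scores.get)


few_exploiters = average_scores(n_exploiters=1, n_others=8)
many_exploiters = average_scores(n_exploiters=8, n_others=8)
print(top_type(few_exploiters), few_exploiters)
print(top_type(many_exploiters), many_exploiters)
```

With one exploiter in the pool, program (a) has the best average score; with eight exploiters it drops behind the ordinary programs, although none of the head-to-head strengths has changed.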


Are you talking to me? I have no problem with what you say. Perhaps others begin
to understand now. You could give SSDF interesting ideas for better testing. - But
you avoided the question of whether Shredder is now the justified number one.
What do you think?

Rolf Tueschen
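
The "apply both error bars" remark quoted above can be made concrete with a rough overlap check. This is a minimal sketch, not a formal significance test: the program names, ratings and margins below are hypothetical placeholders, not figures from the SSDF list, and `number_one_candidates` is a helper name made up for this illustration.

```python
def number_one_candidates(entries):
    """Names of programs whose upper bound reaches the leader's lower bound.

    Each entry is (name, rating, plus_margin, minus_margin).
    Overlapping intervals are only a rough heuristic for "could be first".
    """
    leader = max(entries, key=lambda entry: entry[1])
    leader_low = leader[1] - leader[3]  # leader's rating minus its minus margin
    return [
        name
        for name, rating, plus, minus in entries
        if rating + plus >= leader_low  # apply the challenger's plus margin too
    ]


# Hypothetical list: (name, rating, plus margin, minus margin)
rating_list = [
    ("P1", 2800, 30, 30),
    ("P2", 2770, 25, 25),
    ("P3", 2745, 30, 28),
    ("P4", 2700, 20, 20),
]

print(number_one_candidates(rating_list))
```

In this made-up list, P3's rating alone lies below the leader's lower bound, but P3 still counts as a candidate once its own plus margin is applied. That is why using both error bars widens the group of possible number ones.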




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.