Computer Chess Club Archives


Subject: Re: Tournaments-Tournaments !

Author: Dann Corbit

Date: 00:30:53 11/20/99


On November 20, 1999 at 02:42:03, blass uri wrote:
>On November 20, 1999 at 00:37:32, Wayne Lowrance wrote:
><snipped>
>>At the moment, for me, the only gauge is SSDF. Its results seem to stand up
>>pretty well. SSDF has said for a few years now that Fritz was the strongest
>>program
>
>The SSDF did not say that, because they did not test all the programs.
>The SSDF is saying now that Chessmaster 6000 has a better rating than Fritz on
>the same hardware and that Tiger has clearly better performance than Fritz.

From what we have seen so far, all programs in the top 8 or so are peers in
ability.  In other words, their ratings lie within one standard deviation of
each other, so there is nothing to tell which is the stronger.  The mean value
may be slightly higher for some programs, but unless you play a bazillion games,
there really is not enough data to separate them with statistical confidence.
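
To get a feel for how slowly the uncertainty shrinks, here is a rough sketch of
the one-standard-deviation error bar on a rating difference after n games.  It
is an illustration only, not how the SSDF computes its margins.  Assumptions:
the usual logistic Elo curve, an evenly matched pair (expected score 0.5), and
every game decisive.  Draws would make the real bars somewhat narrower.  The
names elo_diff and elo_margin are made up for this example.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_margin(n_games, score=0.5):
    """Approximate one-standard-deviation error bar, in Elo, after n_games.

    Assumes independent, decisive games (no draws), which overstates
    the variance a little for real chess results.
    """
    # Standard error of the observed score fraction (binomial model).
    se = math.sqrt(score * (1.0 - score) / n_games)
    # Map score +/- se onto the Elo scale and take half the width.
    return (elo_diff(score + se) - elo_diff(score - se)) / 2.0

for n in (100, 400, 1000, 4000):
    print(f"{n:5d} games: +/- {elo_margin(n):.0f} Elo (one standard deviation)")
```

The bar shrinks only with the square root of the number of games, so cutting it
in half takes four times as many games.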

Within one standard deviation, you are really saying:
This program has an Elo strength *relative to this pool of peers* which has
about a 68% chance of being between the "+" mark and the "-" mark.  It takes
*two* standard deviations to be about 95% sure.
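
The coverage figures for one and two standard deviations can be checked
directly, assuming the rating error is normally distributed (an assumption of
this sketch; the function name coverage is made up for the example):

```python
import math

def coverage(k):
    """Probability that a normally distributed value falls within
    k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2):
    print(f"within {k} standard deviation(s): {coverage(k):.1%}")
```

This prints roughly 68.3% and 95.4%.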

IOW, which program tops the SSDF list is pretty much a crap-shoot.  On the other
hand, a higher x-bar (mean rating) is a real indication of strength.  It just is
not as certain as most people seem to think it is.





Last modified: Thu, 15 Apr 21 08:11:13 -0700
