Computer Chess Club Archives


Subject: Re: comparison with SSDF?

Author: Rolf Tueschen

Date: 06:48:41 06/23/04


On June 23, 2004 at 09:25:07, martin fierz wrote:

>On June 22, 2004 at 12:13:28, Manfred Meiler wrote:
>
>>Hello,
>>
>>at http://www.computerschach.de/test/index.htm there's an updated version of my
>>Excel sheet with the detailed results of 260 engines in the test suite
>>"Weltmeister-Test" (WM-Test) available for free download (1,5 MB).
>
>a lot has been said about the WM-test and its usefulness (or lack thereof).
>instead of all this talk, somebody (not me, i'm definitely not interested
>enough...) could compile a list of ratings WM-test and ratings SSDF of the same
>programs, for about 50 programs. in order not to bias the result, this selection
>should be done in a well defined way, e.g. take the newest version of each
>program (i.e. only fritz 8, not older versions).
>
>then we'd have a table looking like this:
>
>engine name     WM-test-rating    SSDF-rating
>A               2xxx              2xxx
>B               2xxx              2xxx
>.... and so on.
>
>it will then be easy to compare the differences in this list, and to say how
>well the WM-test can predict a rating (if we accept the SSDF list as the "real"
>rating). remember, there should be a LARGE number of engines involved, not only
>5 or 10, so that the statistics are more significant.
>
>that would be much more sensible than the discussions i have seen up to now on
>this subject!
>
>if we get something like an error margin of +-50 rating points for the WM-test,
>i would call that a success for the WM-test. you cannot expect a test suite
>which can be run in a short time to produce perfect results.
>
>cheers
>  martin
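
The comparison proposed in the quoted post can be sketched in a few lines of Python. This is only a minimal illustration, assuming the table is available as (engine, WM-test rating, SSDF rating) rows and that the SSDF rating is taken as the reference value. The function name compare_ratings and the three sample rows are hypothetical placeholders, not real WM-test or SSDF results.

```python
import math

def compare_ratings(rows):
    """Summarise how far WM-test ratings deviate from SSDF ratings.

    rows: list of (engine_name, wm_test_rating, ssdf_rating) tuples.
    The SSDF rating is treated as the reference, as the quoted post suggests.
    """
    if len(rows) < 2:
        raise ValueError("need at least two engines to compare")
    # Signed difference per engine: positive means the WM-test overrates it.
    diffs = [wm - ssdf for _, wm, ssdf in rows]
    n = len(diffs)
    mean_diff = sum(diffs) / n                       # systematic offset
    rms_error = math.sqrt(sum(d * d for d in diffs) / n)  # typical error size
    max_abs_error = max(abs(d) for d in diffs)       # worst single engine
    return {
        "n": n,
        "mean_diff": mean_diff,
        "rms_error": rms_error,
        "max_abs_error": max_abs_error,
    }

# Placeholder numbers for illustration only; NOT real ratings.
rows = [
    ("A", 2600, 2650),
    ("B", 2500, 2470),
    ("C", 2400, 2420),
]
summary = compare_ratings(rows)
print(summary)
```

With about 50 engines in rows, an rms_error of roughly 50 points or less would correspond to the "+-50 rating points" success criterion mentioned in the quoted post, while mean_diff would show whether the WM-test is systematically too high or too low.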


You have it _all_ wrong!

a) The WM-CSS-Test authors do NOT claim that they can predict playing strength

b) The SSDF listing is invalid

c) Neither "test" can produce Elo numbers, but here the CSS-WM test is the
bigger hoax, because the 2600 number in its formula alone insinuates Elo
realities

The only thing they have in common is that both "tests" give their users a lot
of fun and give ChessBase and CSS a lot of profit (however small it is).



Last modified: Thu, 15 Apr 21 08:11:13 -0700
