Computer Chess Club Archives


Subject: Re: comparison with SSDF?

Author: Andrei P

Date: 18:15:31 06/23/04


On June 23, 2004 at 09:25:07, martin fierz wrote:

>On June 22, 2004 at 12:13:28, Manfred Meiler wrote:
>
>>Hello,
>>
>>at http://www.computerschach.de/test/index.htm there's an updated version of my
>>Excel sheet with the detailed results of 260 engines in the test suite
>>"Weltmeister-Test" (WM-Test) available for free download (1,5 MB).
>
>a lot has been said about the WM-test and its usefulness (or lack thereof).
>instead of all this talk, somebody (not me, i'm definitely not interested
>enough...) could compile a list of ratings WM-test and ratings SSDF of the same
>programs, for about 50 programs. in order not to bias the result, this selection
>should be done in a well defined way, e.g. take the newest version of each
>program (i.e. only fritz 8, not older versions).
>
>then we'd have a table looking like this:
>
>engine name     WM-test-rating    SSDF-rating
>A               2xxx              2xxx
>B               2xxx              2xxx
>.... and so on.
>
>it will then be easy to compare the differences in this list, and to say how
>well the WM-test can predict a rating (if we accept the SSDF list as the "real"
>rating). remember, there should be a LARGE number of engines involved, not only
>5 or 10, so that the statistics are more significant.
>
>that would be much more sensible than the discussions i have seen up to now on
>this subject!
>
>if we get something like an error margin of +-50 rating points for the WM-test,
>i would call that a success for the WM-test. you cannot expect a test suite
>which can be run in a short time to produce perfect results.
>
>cheers
>  martin


I think comparing the SSDF list and the WM-test is a very good idea. Only, one
should not look at the numbers per se, since the WM-test was not designed to
predict Elo, but only at the ordering of engines by strength. If SSDF orders
engines fritz8, junior8, crafty19-13 and WM orders them differently, then I would
say WM is not a good predictor of playing strength. But the absolute numbers are
not relevant.
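
To make "compare only the ordering" concrete, here is a minimal sketch using
Spearman's rank correlation, which looks only at the ranks and ignores the
absolute ratings. The engine names and all numbers below are made up for
illustration; they are not real WM-test or SSDF results.

```python
def ranks(values):
    """Rank values from 1 (highest) downward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of equal values so they get the same rank.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical ratings, NOT real data: engine -> (WM-test, SSDF).
ratings = {
    "A": (2600, 2700),
    "B": (2550, 2650),
    "C": (2500, 2680),
    "D": (2400, 2500),
}
wm = [pair[0] for pair in ratings.values()]
ssdf = [pair[1] for pair in ratings.values()]
rho = spearman(wm, ssdf)
print(f"rank correlation WM vs SSDF: {rho:.2f}")
```

A value near 1 would mean both lists order the engines almost identically,
whatever the absolute numbers are. With 50 or so engines, as martin suggests,
the figure would be much more meaningful than with a handful.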

In fact, if WM strength correlates well with SSDF strength in the ordering of
engines (which it probably does), then one can easily modify the original WM
formula so that it does predict SSDF Elo well. I am not sure why this type of
data has not been collected and made public. Maybe I am just not aware of it.
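
One simple way to do such a modification, sketched under the assumption that
the relation is roughly linear, is a least-squares fit of SSDF rating against
WM rating. This does not touch the original WM formula (which I do not
reproduce here); it only rescales its output. All numbers are again made up
for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def max_abs_error(xs, ys, slope, intercept):
    """Largest absolute gap between the fitted prediction and the real value."""
    return max(abs(slope * x + intercept - y) for x, y in zip(xs, ys))


# Hypothetical ratings, NOT real data.
wm_ratings = [2400, 2500, 2600, 2700]
ssdf_ratings = [2460, 2490, 2560, 2590]

slope, intercept = fit_line(wm_ratings, ssdf_ratings)
worst = max_abs_error(wm_ratings, ssdf_ratings, slope, intercept)
print(f"predicted SSDF = {slope:.3f} * WM + {intercept:.1f}")
print(f"worst error on this sample: {worst:.1f} rating points")
```

The worst-case error is the figure to hold against martin's +-50 points. To be
fair to the test, it should be measured on engines that were not used for the
fit.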



Last modified: Thu, 15 Apr 21 08:11:13 -0700
