Author: Andrei P
Date: 18:15:31 06/23/04
On June 23, 2004 at 09:25:07, martin fierz wrote:

>On June 22, 2004 at 12:13:28, Manfred Meiler wrote:
>
>>Hello,
>>
>>at http://www.computerschach.de/test/index.htm there's an updated version of my
>>Excel sheet with the detailed results of 260 engines in the test suite
>>"Weltmeister-Test" (WM-Test) available for free download (1,5 MB).
>
>a lot has been said about the WM-test and it's usefulness (or lack thereof).
>instead of all this talk, somebody (not me, i'm definitely not interested
>enough...) could compile a list of ratings WM-test and ratings SSDF of the same
>programs, for about 50 programs. in order not to bias the result, this selection
>should be done in a well defined way, e.g. take the newest version of each
>program (i.e. only fritz 8, not older versions).
>
>then we'd have a table looking like this:
>
>engine name    WM-test-rating    SSDF-rating
>A              2xxx              2xxx
>B              2xxx              2xxx
>.... and so on.
>
>it will then be easy to compare the differences in this list, and to say how
>well the WM-test can predict a rating (if we accept the SSDF list as the "real"
>rating). remember, there should be a LARGE number of engines involved, not only
>5 or 10, so that the statistics are more significant.
>
>that would be much more sensible than the discussions i have seen up to now on
>this subject!
>
>if we get something like an error margin of +-50 rating points for the WM-test,
>i would call that a success for the WM-test. you cannot expect a test suite
>which can be run in a short time to produce perfect results.
>
>cheers
>  martin

I think comparing the SSDF and WM-test results is a very good idea. Only, one should not look at the numbers per se, since the WM-test was not designed to predict Elo. One should look only at the ordering of engines by strength. If SSDF orders the engines fritz8, junior8, crafty19-13 and WM orders them differently, then I would say WM is not a good predictor of playing strength. But the absolute numbers are not relevant.
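One possible way to put a number on "same ordering or not" is a rank correlation such as Spearman's rho, which ignores the absolute ratings and only compares the two orderings. The sketch below is just an illustration: the engine ratings in it are made-up placeholders (not real WM-test or SSDF values), the function names are my own, and it assumes no two engines share exactly the same rating in either list.

```python
def ranks(values):
    """Return the rank of each value (1 = highest rating). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for two equally long lists without ties."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    # Sum of squared rank differences, engine by engine.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Placeholder ratings for four hypothetical engines A, B, C, D.
wm_ratings = [2700, 2650, 2600, 2550]
ssdf_ratings = [2720, 2600, 2640, 2500]

rho = spearman(wm_ratings, ssdf_ratings)
print(round(rho, 3))
```

A value close to 1 would mean the two lists order the engines almost identically, a value near 0 would mean the WM ordering says little about the SSDF ordering. With only a handful of engines the value is not very meaningful, which is the reason for asking for a large sample.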
In fact, if the WM ordering of engines correlates well with the SSDF ordering (which it probably does), then one can easily modify the original WM formula so that it does predict SSDF Elo well. I am not sure why this type of data has not been collected and made public. Maybe I am just not aware of it.
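As a rough sketch of what such a modification could look like: the original WM formula is not given in this thread, so the example below simply assumes a linear rescaling (SSDF is approximately slope * WM + intercept) fitted by ordinary least squares, and then checks the largest prediction error against the +-50 points mentioned above. The ratings are again made-up placeholders and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def max_abs_error(xs, ys, slope, intercept):
    """Largest absolute difference between predicted and actual rating."""
    return max(abs(slope * x + intercept - y) for x, y in zip(xs, ys))


# Placeholder ratings for four hypothetical engines (not real data).
wm_ratings = [2700, 2650, 2600, 2550]
ssdf_ratings = [2720, 2600, 2640, 2500]

slope, intercept = fit_line(wm_ratings, ssdf_ratings)
worst = max_abs_error(wm_ratings, ssdf_ratings, slope, intercept)
print(slope, intercept, worst)
print("within +-50 points" if worst <= 50 else "outside +-50 points")
```

To be fair to the test, the fit should be made on one set of engines and the error checked on a different set, otherwise the error margin will look better than it really is.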
Last modified: Thu, 15 Apr 21 08:11:13 -0700