Computer Chess Club Archives


Subject: Re: CSS WM TEST - a technical view

Author: martin fierz

Date: 05:11:59 06/15/04


On June 15, 2004 at 05:55:21, Franz Hagra wrote:

>1. looking at the used formula
>
>rating WM-Test = base 2600 + (2 x LQ) - [5 x (GZ : 100)]
>
>(where LQ= number of solved positions; GZ = solve time incl. penalty time of
>1200 for each unsolved position)
>
>and here the published Ratinglist (only Top 4 out of 230)
>
>X3D Fritz------2.711
>Gambit---------2.709
>Deep Fritz 8---2.704
>CM 9000--------2.702
>...
>
>Everyone with even a little knowledge of measurement and significance knows
>that this is really nonsense. Here we only have 2 significant figures, and
>so the significant result of the test is, e.g., not 2711 but 2700 (within the
>range 2650-2749) - other possible measured values are 2600 and 2800.
>
>So the correct WM Test Ratinglist is:
>
>1. 2700 engines formerly ranked 1-94 (here you find nearly all newer engines)
>2. 2600 engines formerly ranked 95-229 (amateur and older pros)
>3. 2500 Queen 2.28 (UCI)
>
>Hagra

in principle your criticism is probably right, but your solution is wrong.

you should quote the error, and not round results. you can still measure to your
best ability, and give the number you measured. therefore it should be

X3D Fritz------2.711 +-E1
Gambit---------2.709 +-E2
Deep Fritz 8---2.704 +-E3
CM 9000--------2.702 +-E4

where E1 etc. are the error estimates. i have no idea how to come up
with these estimates, but you seem to believe they are around 100 rating points.
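
the quoted formula is easy to play with. below is a minimal python sketch of it; the function names and the example solve times are hypothetical, not part of the WM-Test itself, and it assumes GZ is measured in the same time units as the 1200 penalty. it only shows how far the rating moves when a single position flips from unsolved to solved, which gives a rough feel for the granularity of the test. it is not a real error estimate.

```python
PENALTY = 1200  # penalty time per unsolved position, as in the quoted formula


def wm_test_rating(solved, total_time, base=2600):
    """Quoted formula: base + 2*LQ - 5*(GZ/100)."""
    return base + 2 * solved - 5 * (total_time / 100)


def total_time(solve_times, n_unsolved):
    """GZ: sum of solve times plus the penalty for each unsolved position."""
    return sum(solve_times) + PENALTY * n_unsolved


def gain_from_one_more_solve(solve_time):
    """Rating change when one unsolved position becomes solved in solve_time."""
    return 2 + 5 * (PENALTY - solve_time) / 100


# hypothetical engine: three positions solved, one unsolved
times = [10, 250, 900]
before = wm_test_rating(len(times), total_time(times, n_unsolved=1))
# same engine, but the fourth position is now solved in 600 time units
after = wm_test_rating(len(times) + 1, total_time(times + [600], n_unsolved=0))
print(before, after, after - before)
```

under these assumptions a single position is worth between 2 and 62 rating points depending on how fast it is solved, and a 2-point gap such as 2.711 vs 2.709 corresponds to 40 time units of total solve time.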


cheers
  martin


