Computer Chess Club Archives


Subject: Re: CSS WM TEST - a technical view

Author: Vincent Diepeveen

Date: 10:35:03 06/15/04



On June 15, 2004 at 05:55:21, Franz Hagra wrote:

>1. looking at the used formula

Who cares about the rating formula?

The only problem with the test is that it should be called "patzer test" and not
"WM test". It just measures how much of a patzer a program is, not how good a
program is.

I wouldn't even *blink* if the current Diep version scored *high* on this test
and Hydra topped the list by hundreds of points ;)

>rating WM-Test = base 2600 + (2 x LQ) - [5 x (GZ : 100)]
>(where LQ= number of solved positions; GZ = solve time incl. penalty time of
>1200 for each unsolved position)
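
For reference, the quoted formula can be written out as a small Python sketch. It assumes the ":" in "GZ : 100" means division and that an unsolved position is simply counted as 1200 time units (the quote does not state the unit). The function names and the sample times are made up for illustration; they are not from the test's authors.

```python
from typing import Optional, Sequence

PENALTY = 1200  # penalty time per unsolved position, as quoted (unit not stated)


def total_time_with_penalty(solve_times: Sequence[Optional[float]]) -> float:
    """GZ: sum of solve times, counting each unsolved position (None) as PENALTY."""
    return sum(PENALTY if t is None else t for t in solve_times)


def wm_test_rating(solved: int, total_time: float) -> float:
    """rating = 2600 + (2 x LQ) - [5 x (GZ : 100)], with ':' read as division."""
    return 2600 + 2 * solved - 5 * (total_time / 100)


# Made-up example: three positions, one of them unsolved.
times = [12.0, None, 48.0]
solved = sum(t is not None for t in times)   # LQ
gz = total_time_with_penalty(times)          # GZ
print(wm_test_rating(solved, gz))
```

Read this way, each solved position adds 2 points and every 100 time units cost 5 points, so one unsolved position costs 60 points through the penalty alone.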

Of course any test that takes the solving time into account is flawed. It
doesn't matter whether you find something within 5 seconds or 60, as long as you
find it within tournament time controls.

Furthermore, all those tests focus IMHO too much on patzer moves aimed at the
opponent's king.

Suppose you play a game 1.e4 e5 2.Qh5

That's a major patzer move towards the opponent king.

These tests give a bonus for that type of position.

Secondly, another major flaw is that there are certain positions you *must*
find, while it doesn't matter much whether you find others. A won position
*stays* a won position.

See how Shredder has dominated the last few years. And it didn't win those
titles with just patzer moves...

I do of course realize how much beginners love patzer moves, so new test sets
built on patzer moves will keep on getting produced...

>and here the published Ratinglist (only Top 4 out of 230)
>
>X3D Fritz------2.711
>Gambit---------2.709
>Deep Fritz 8---2.704
>CM 9000--------2.702
>...
>
>Everyone who has even a little knowledge of measurement and significance knows
>that this is really nonsense. Here we only have 2 significant figures, and
>so the significant result of the test is e.g. not 2711 but 2700 (within the
>range 2650-2749) - other possible measurement data are 2600 and 2800
>
>So the correct WM Test Ratinglist is:
>
>1. 2700 former ranked 1-94 engines (here you find nearly all newer engines)
>2. 2600 former ranked 95-229 engines (amateur and older pro's)
>3. 2500 Queen 2.28 (UCI)
>
>Hagra
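
The rounding Hagra describes can be sketched in Python as follows. This is only an illustration of "two significant figures" for four-digit ratings, assuming half-up rounding so that 2650-2749 maps to 2700 as in his range; the function name is made up. The ratings are the four from the quoted list.

```python
def round_to_hundred(rating: int) -> int:
    """Round half-up to the nearest 100, so 2650..2749 all map to 2700."""
    return (rating + 50) // 100 * 100


# Top 4 ratings as quoted above.
top_four = {"X3D Fritz": 2711, "Gambit": 2709, "Deep Fritz 8": 2704, "CM 9000": 2702}
for name, rating in top_four.items():
    print(name, round_to_hundred(rating))
```

The explicit integer arithmetic is deliberate: Python's built-in round() rounds exact halves to the nearest even value, so round(2650, -2) would land on 2600 rather than 2700.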




Last modified: Thu, 15 Apr 21 08:11:13 -0700
