Computer Chess Club Archives

Subject: Re: WM Test Position 1 - another solution found ENGLISH with explanation

Author: Mike S.

Date: 05:49:21 06/10/04

On June 09, 2004 at 19:05:40, Rolf Tueschen wrote:

>(...)

>Now Hagra, an anonymous author with a good knowledge of statistics and chess has
>made the strongest critic against the test that I know of. He basically doubts
>that a chess position from real life chess can test a machine because it is
>difficult to decide why the machine has adopted a specific continuation.

Actually this would be a criticism of *all* test suites, because all of them
follow the same concept: (simply) *find the move* (but not: find the move and
give me a perfect explanation/evaluation/analysis of why it is best).

So if that criticism were valid, it would apply not only to the WM Test but
obviously to all test suites. So all authors and users of test suites have
failed to spot that fatal mistake for so many years...? :-)) I don't claim that
a single man can never have the unique and correct idea while all others are
wrong, but such "Galileo" cases (and also "Leonardo" cases :-)) are very,
very rare.

>(...) The also
>here known author Michael Scheidl assisted in that fight.

Thanks for addressing me as a known author. So it seems that I have achieved at
least a bit (it was a lot of hard work! :-))

>IMO the whole
>argumentation is unfair because if already a general critic is sound and comes
>to a negative judgement then the practical argument has no more sense at all.

How do you see *if* such criticism is sound for a specific position? There's no
other choice than to analyse chess-wise (!) and illustrate the criticism with
variations etc. The latest PGN provided is a good example of how such criticism
should be presented. - The "general" criticism is not generally valid, because
test authors do of course usually choose typical test-like (or test-fitting)
moves to ensure, or at least to make it most likely, that they won't be chosen
for wrong reasons without proper understanding. Typical are *sacrifices* that no
engine would choose "just for fun", so to speak. You don't give up a rook
without seeing the gain. Of course you can't, or in general shouldn't, use
positions for a test where the solution is a normal, boring, run-of-the-mill
move which in itself doesn't allow any conclusion about what it is played for.
Below I give two examples from my Quicktest to illustrate this. Tell me if you
find engines which play these solutions for the wrong reasons. :-)

(For a *very big* test, 1000+ positions, the above could possibly be ignored,
because when, e.g., engine A plays strong moves in 867 out of 1000 positions and
engine B in only 427, that will always tell a lot about the relative analysis
power of A and B, regardless of the "correct reason" question.)
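To put a rough number on that 867 vs. 427 example, here is a minimal sketch of a two-proportion z-test. It is only an illustration, not part of the WM Test or of any engine tool; the function name `two_proportion_z` is made up, and the test assumes that both engines ran the same number of positions and that positions can be treated as independent trials.

```python
import math

def two_proportion_z(solved_a, solved_b, n):
    """z statistic for two solve rates, each measured on n positions.

    Hypothetical helper: assumes independent positions and equal n.
    """
    p_a = solved_a / n
    p_b = solved_b / n
    # Pooled solve rate under the hypothesis "both engines are equally strong".
    pooled = (solved_a + solved_b) / (2 * n)
    # Standard error of the difference between the two rates.
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    return (p_a - p_b) / se

# The figures from the example above: 867 and 427 solved out of 1000.
z = two_proportion_z(867, 427, 1000)
print(round(z, 1))
```

With these figures z comes out at roughly 20, far beyond the usual significance thresholds of 2 or 3, so a gap of that size on 1000 positions would be hard to attribute to chance.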

Did you take a look at the currently available WM Test results of 230 (!!)
engines yet? You'll find the

known strong engines at the top,
medium engines (good amateurs, older professional engines) in the middle ranks,
and weaker engines at the end of the ranking list.

(With only very few exceptions or "surprising" rankings.)

How would you explain these results if that whole test (and its method) weren't
valid?? Is it wizardry? :-)

>(...) it is also
>possible to play Rad8 and way later Re3 instead of the test solution Re3.

But Rad8 has no forcing character IMO and threatens nothing special. Re3 creates
the strong threat Rxg3. (Just an observation. - I hope Mikhail will add some
comments about this.)

>All
>who know details about tests know that the fact of a second solution decreases
>the value of a test position.

True - but only when there really is a second solution of *almost the same
strength*; I think alternatives which are clearly weaker are not a problem,
because the challenge is to find the *best* move and not just a good move,
*in analysis* (it could be discussed whether that is different in practical
games). Of course, perfectly clear positions without alternatives, e.g. only
move X draws and all others lose, are preferable.

>But the readers shouldn't forget that the main problem for such engine tests is
>the finding of positions which allow to test what the test founder pretended.
>Here the WM Test allegedly can test the ability to analyse. (...)

It can, because

1. we know the good, difficult continuations of the test positions, and
2. during the test run, engines have to analyse these.

So the engines analyse, and we can compare and judge the analysis results
(at move #1). Basically that's no different from the way 99.9% of all test
suites are done.
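As a rough illustration of that "compare at move #1" idea, here is a minimal sketch. The function `score_suite` and the engine answers are invented for the example; only the two expected moves are taken from the Quicktest positions given at the end of this post. Real test tools also look at solution times, which is left out here.

```python
def score_suite(expected, played):
    """Count positions where the engine's first move equals the known solution.

    expected: position id -> solution move (from the test author)
    played:   position id -> move the engine chose at move #1
    """
    solved = [pid for pid, best in expected.items() if played.get(pid) == best]
    return len(solved), len(expected), solved

# Solutions of the two Quicktest examples from this post.
expected = {"Quick-01": "Qc3", "Quick-03": "Qa8"}

# Hypothetical engine output, invented for illustration only.
hypothetical_engine_moves = {"Quick-01": "Qc3", "Quick-03": "Qxd7"}

print(score_suite(expected, hypothetical_engine_moves))
```

Note that the score only records *whether* the solution move was chosen, not why; that is exactly the point under discussion.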

>In other words, the academic doctor MG claims a deeper meaning
>with his test but in reality he has put together these 100 positions without
>showing the validity of the positions for his own insinuations into the test!

IMO the question of validity has to be answered chess-wise in the first place,
and MG has done that by giving solution variations, subvariations and comments
in the data provided with the WM Test package. Nevertheless, many positions
require some study by the user to be convinced of and/or to understand the
solution. Some I found very difficult, and I had my doubts too. This may depend
on playing strength (it may often be clearer for stronger players - and for
engines). Nowadays tests simply have to be very difficult to reveal any
significant differences between strong engines.

Anyway, it will always be best to discuss test positions and test suites based
on direct chess-wise analysis etc., which may be supported (or started) by
engine output, but not by engine output alone. This will lead to constructive
dialogue and fun with chess itself. One point of the discussion was that strange
observations of engine output *only* are not sufficient to base valid criticism
on, and if you look at the last criticism issued, it seems that there is
consensus about this :-) maybe with the exception of you (?).

Regards,
Michael Scheidl

[D]1n1r1rk1/ppq2ppp/3p2b1/3B1NP1/4PB1R/bP2P2P/P1P5/3KQ1R1 w - - 0 1
1.Qc3! (Quick-01)

[D]3Q4/3p4/P2p4/N2b4/8/4P3/5p1p/5Kbk w - - 0 1
1.Qa8! (Quick-03)
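For readers who want to set up the two positions without a GUI, here is a minimal sketch that reads the piece-placement field of the FEN strings above (the text after the forum's [D] diagram tag). The helper `parse_fen_board` is made up for this example; it checks only the board layout, not move legality, and ignores the side-to-move and castling fields.

```python
def parse_fen_board(fen):
    """Return a dict square -> piece letter from the first FEN field."""
    ranks = fen.split()[0].split("/")
    if len(ranks) != 8:
        raise ValueError("FEN must describe 8 ranks")
    board = {}
    for rank_index, rank_text in enumerate(ranks):
        rank_number = 8 - rank_index  # FEN starts with rank 8
        file_index = 0
        for ch in rank_text:
            if ch.isdigit():
                file_index += int(ch)  # run of empty squares
            else:
                if file_index > 7:
                    raise ValueError("too many squares in rank")
                board["abcdefgh"[file_index] + str(rank_number)] = ch
                file_index += 1
        if file_index != 8:
            raise ValueError("rank does not have 8 squares")
    return board

quick_01 = parse_fen_board(
    "1n1r1rk1/ppq2ppp/3p2b1/3B1NP1/4PB1R/bP2P2P/P1P5/3KQ1R1 w - - 0 1")
quick_03 = parse_fen_board("3Q4/3p4/P2p4/N2b4/8/4P3/5p1p/5Kbk w - - 0 1")

# White queen squares and the (empty) target squares of the solutions.
print(quick_01["e1"], "c3" in quick_01)
print(quick_03["d8"], "a8" in quick_03)
```

In both positions the white queen moves to an empty square, so neither solution is a capture that an engine would pick simply for the material.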


Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.