Computer Chess Club Archives


Subject: Re: Criteria for Good Test Positions = ?

Author: Mike S.

Date: 14:19:19 12/09/02


On December 09, 2002 at 09:08:28, Bob Durrett wrote:

>(...)
>Suppose a suite of test positions, each of which contained the positional
>features of interest, were used.  Suppose also that the engine came up with the
>right answer for each and every position.
>
>Would it really matter why the engine came up with all the right answers?

It would depend on how doubtful each of the positions is. The less doubtful a
position is, the less it matters whether the engine played the solution for the
intended reason or not.

With positional tests, it's IMO very difficult - if not impossible - to achieve
a good testing character as described in requirements (c) and (d).

But OTOH, I wouldn't overrate the problem. With a very large number of
positions, similar to Ingo Lindam's suggestion (but preferably not randomly
chosen), chances are good that the majority of solutions are OK. At least I
would expect that only a minority of positions are affected by the problems
mentioned. But again, it all depends on the quality of the positions, i.e. how
"suitable" for tests they are.

>Maybe it came up with all of its answers "for all the wrong reasons."
>
>But if the engine solves every positional test position that the humans can
>throw at it, wouldn't it be safe to say the engine can play positional chess?

Basically this is perfectly true when we talk about moves played in a *game*.
In a game, only the moves played matter, not why and how they were evaluated,
whether the PV made sense, etc. That's part of the reason why I think a test
position should focus on the first move of the solution and nothing else, but
that OTOH requires the testing character mentioned.

To make it short, a perfect test position is one where you only need to look at
the first move and bingo! that's it.
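
To illustrate what "only look at the first move" would mean in practice, here is
a minimal sketch of such a scoring pass. Everything in it is an assumption made
for the example: the names TestPosition and score_suite, the position ids and
the moves are hypothetical, and the engine's answers are supplied as a plain
dictionary rather than read from a real engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestPosition:
    """One suite entry. All field names are hypothetical."""
    position_id: str
    best_moves: frozenset  # accepted first moves for this position


def score_suite(suite, engine_moves):
    """Count positions where the engine's first move is an accepted solution.

    Only the first move is compared; evaluation and PV are ignored on purpose.
    A position with no engine answer counts as unsolved.
    """
    solved = [p.position_id for p in suite
              if engine_moves.get(p.position_id) in p.best_moves]
    return len(solved), len(suite), solved


# Made-up suite and made-up engine answers, for illustration only.
suite = [
    TestPosition("pos1", frozenset({"Nf5"})),
    TestPosition("pos2", frozenset({"Rxe6", "Qh5"})),
    TestPosition("pos3", frozenset({"Kd3"})),
]
engine_moves = {"pos1": "Nf5", "pos2": "Qh5", "pos3": "Ke3"}

n_solved, n_total, solved_ids = score_suite(suite, engine_moves)
print(f"{n_solved}/{n_total} solved: {solved_ids}")
```

A check like this is only meaningful if the positions themselves have the
testing character mentioned; it cannot tell whether a move was found for the
intended reason.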

In suboptimal and bad positions, which lack the (c)+(d) requirements, you'll
have to look at the evaluation, main variation etc. too, or even see how the
engine would actually continue to play (whether it really plays as intended...).
IOW, you have to produce an in-depth expert opinion for each engine's result on
each position. This is very inconvenient and will probably take so much time
that you won't be finished before the program is outdated. :o) It is better to
have a suite which can produce reliable results quickly.

I completely quit searching for good *positional* tests some time ago, and
focus mainly on tactics. Some endgame things (i.e. specific knowledge cases)
should be fairly easy to test in that way, too.

But for positional judgements, I would recommend getting an impression from
games the engine has played, rather than from positional suites (where you'll
probably be too busy checking whether you can trust the positions before it
makes sense to start testing...).

Regards,
M.Scheidl




Last modified: Thu, 15 Apr 21 08:11:13 -0700
