Computer Chess Club Archives


Subject: Re: Proposal: New testing methods for SSDF (1)

Author: Dirk Frickenschmidt

Date: 09:57:07 04/13/98


Hi Jeroen,

first of all, congratulations on one of the most substantial posts on a
critical issue that I have read here in a long time!

Both of the testing methods you propose are worth thinking about.

1. Concerning the latter, with 50 opening positions: I think this way of
testing is the most interesting. But I see three problems:

a) 50 positions are too many (though this number is surely adequate for
the variety of modern chess). They consume *much* too much time, since
each program-versus-program match would need 100 games. I think the
practical limit for this kind of test is 20 positions. I know it
will be hard to find a good, representative test set then.

b) The 50 positions you posted here (which I regard as very generous
of you as a professional tester and book author, not someone just posting
the results of a hobby) are certainly well chosen.
The only problem is that they, like many of the Nunn positions, end
too soon compared to modern opening books (in computers as well as in
human brains).

In fact, I think this does not reflect the strength of modern programs,
which normally stay in book until at least move 12-15, and in some cases
(whether this makes sense is another question) until much higher move
numbers.

So I think a compromise would be useful: taking a set of slightly more
developed positions (kings castled, and substantial pawn structures and
piece placements for the early middlegame on the board; I guess this would
mean about three or four moves more on average than your average). Again,
I know this is not easily done: the more specific the chosen positions
are, the more important it becomes that your choice still covers
*relevant* positions, so that the games and their results give some
insight into the playing strength and style the programs offer.

c) For me as a user it would be a pity and a real drawback to see none
of the SSDF games (we see few enough of them as it is)! What interests me
most is often not the pure results but seeing *how* a program plays a
given position against a certain opponent.
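
To put numbers on point 1a), here is a minimal Python sketch of the game counts. It assumes, as the "50 positions, 100 games" figure suggests, that each position is played once with each colour. The round-robin function and the figure of 10 programs are purely illustrative assumptions, not anything the SSDF has specified.

```python
def games_per_match(num_positions: int) -> int:
    """Each test position is played twice, once with each colour."""
    return 2 * num_positions


def games_in_round_robin(num_programs: int, num_positions: int) -> int:
    """Total games if every program meets every other program once.

    The single round-robin schedule is an assumption for illustration.
    """
    pairings = num_programs * (num_programs - 1) // 2
    return pairings * games_per_match(num_positions)


if __name__ == "__main__":
    # Compare the proposed 50 positions with the suggested limit of 20.
    for positions in (50, 20):
        print(positions,
              games_per_match(positions),
              games_in_round_robin(10, positions))  # 10 programs: hypothetical
```

With 20 positions a single match drops from 100 games to 40, which is the saving argued for in 1a).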

2. Concerning your first method: although it is perhaps not as
attractive as the latter, I think in general it would be easier for
the SSDF to handle.
(Just by the way :-) One question will be: will they finally kill the
doubles, as you and I and others have been hoping for a long time? Will
they finally admit at all that something like a double can be defined?)

But even in this case there remain problems to be solved:

a) Which kind of database, compiled by whom, should be taken as the basis?

b) How will SSDF testers, most of whom are to this day not able to save
their test games in a common format and publish them, ever be able
to handle the technical aspects of this procedure? They would need some
kind of extra program to play out a randomly chosen opening, then set up
this position on both computers, and still get all the more or less
working auto232s to do their job as required. Or they would have to
convert the big book into all available formats without the help of the
programmers. How can it be done?

c) Where will the openings be cut off, and with what kind of strange
effects in different openings? (I observed some of these problems in the
Fritz5 powerbook, which is *very* broad but in some variations not as
deep as human theory or some computer books.)
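
The "extra program" mentioned in 2b) could be quite small. The following Python sketch picks one line at random from a book and cuts it off at a fixed depth, which also makes the cut-off question in 2c) explicit as a parameter. The three book lines, the names `BOOK`, `pick_opening` and `format_moves`, and the cut-off of 8 half-moves are all assumptions for illustration; it does not set up the position on any program or talk to an auto232.

```python
import random

# Illustrative mini "book": each entry is one opening line in SAN.
# These three lines are placeholders, not a proposed SSDF test set.
BOOK = [
    "e4 e5 Nf3 Nc6 Bb5 a6 Ba4 Nf6 O-O Be7",   # Ruy Lopez
    "d4 d5 c4 e6 Nc3 Nf6 Bg5 Be7 e3 O-O",     # Queen's Gambit Declined
    "e4 c5 Nf3 d6 d4 cxd4 Nxd4 Nf6 Nc3 a6",   # Sicilian Najdorf
]


def pick_opening(book, rng, max_plies=None):
    """Choose one book line at random and cut it off after max_plies half-moves."""
    moves = rng.choice(book).split()
    return moves if max_plies is None else moves[:max_plies]


def format_moves(moves):
    """Render a list of SAN moves as numbered text, e.g. '1. e4 e5 2. Nf3'."""
    parts = []
    for i in range(0, len(moves), 2):
        parts.append(f"{i // 2 + 1}. " + " ".join(moves[i:i + 2]))
    return " ".join(parts)


if __name__ == "__main__":
    rng = random.Random(1998)  # fixed seed so a draw can be reproduced
    print(format_moves(pick_opening(BOOK, rng, max_plies=8)))
```

The tester would still have to enter the printed moves on both computers by hand, so this only addresses the random choice, not the harder problem of format conversion.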

3. Concerning the auto232 device.

I had the opportunity to use the Chessbase autoplayer and observe the
results. I noticed no special effect of it at all and have come to the
conclusion that it works just like any well-known auto232 device, except
for the nice feature that it switches between white and black games (so
you don't have to play a whole series with one colour before using the
other). Until now I have never seen any effect that makes me think of
something manipulative in it. And, frankly, I am convinced that someone
like Matthias Wuellenweber would never try to use such a technical
device as a kind of cheating device, even if that were technically
possible (I have not yet heard any plausible argument concerning
this possibility).

The main problem is the more and more absurd "book wars", of which you
have been a victim yourself at times.

Although I don't like the Chessbase reaction as a user, I must admit I
understand it from Chessbase's point of view: they simply want to avoid
the new kind of killing book (I call it that no matter what others think
of it), where pre-played autoplayer games become part of a new book which
then plays these wins as "openings" against the chosen targets in the
SSDF list.

As far as I know, this is the only reason why Chessbase refuses to make
their autoplayer available to everybody: it seems to be no secret
cheating device, but a simple auto232 player that they want to keep from
being booked by others (not by you, as I know from your fair and
attractive way of book programming).

Perhaps there are solutions for these problems?

Your innovative article will surely encourage others as well as me not
to give up too easily in looking for some.

Thanks and kind regards
from Dirk



Last modified: Thu, 15 Apr 21 08:11:13 -0700