Computer Chess Club Archives


Subject: Re: Chessfun and Nunn1 Tests

Author: Eelco de Groot

Date: 16:43:52 05/10/00





 Why the Nunn test

>A common book, or a special book for each engine, is to be preferred, since
>it's the strength of the program we're interested in,

That was exactly my argument: the strength of the program as a whole (for the
sake of simplicity I'll treat that as the combination of engine + timing
algorithm + book + learner) was not really what Chessfun, Christophe and others
were interested in. At least that's how I understand it. Measuring that would
a. need a lot more games, because of the noise introduced by opening books and
learners, and b. not be very significant anyway, because only two programs,
Crafty and Fritz, were tested. That's rather a small pool. If that had been the
object, I'm sure Chessfun would have let more programs play.

As I said, I saw the object more as looking at the influence of a. time control
and b. pondering on the strength of a typical program. Since these (a. and b.)
mainly influence the combination of 1. engine and 2. timing algorithm, it pays
off to limit the influence of 3. book openings and 4. learners. Hence the Nunn
test. But other opening positions, for instance ones present in both books as
Christophe suggested, or early middlegame positions, would have served too.
Jeroen Noomen also prepared a set of reasonably balanced opening positions; if
somebody wants to carry out more tests like this, I'm sure Jeroen would be
willing to e-mail them to interested parties. It's the principle involved, not
the particular positions.

 Attempt at an analogy

It's a bit difficult to find a good analogy, but for myself I see it like this:
it's something like throwing a set of dice. (Thorsten would call it throwing
bones, but then my analogy doesn't work anymore.) The dice are all weighted to
a degree and you want to find out how much. One of the dice is weighted by
engine strength and timing algorithm (like I said, it's not a very good
analogy), and turning pondering on or off or changing the time control may
change the weighting of this die. But there are other dice too, and they stand
for learners and opening books. You assume that changing time control or
pondering has little effect on their contribution to the result. The problem is
that the result is the total of points of all the dice thrown together; you
can't look at the individual dice. Using normal dice gives random results; the
analogy there might be a program that plays random moves. What I am trying to
say is that including "opening book dice" and "learner dice" basically
introduces a lot of noise for the object of the experiments. It's just my
opinion that in practice there are a lot of these "opening dice" and "learner
dice" involved. You then need a lot more throws to filter out the effect of
pondering and time control, which is what we were interested in.
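
To put rough numbers on the dice picture, here is a minimal sketch that takes
the analogy literally. It is only an illustration, not a model of real chess
results: the function name throws_needed, the shift of 0.5 points and the
threshold z = 2 are all made up for the example, and every die (including the
weighted one) is assumed to have roughly the variance of a fair six-sided die.
It compares two settings (say ponder on versus ponder off) and estimates how
many throws per setting are needed before a shift in the one weighted die
stands out from the spread of the total.

```python
import math

# Variance of a single fair six-sided die (faces 1..6).
# Assumption: the weighted die has roughly this variance too.
DIE_VARIANCE = 35 / 12


def throws_needed(shift, noise_dice, z=2.0):
    """Rough number of throws per setting needed to notice `shift`.

    shift      -- hypothetical change in the mean of the weighted die
    noise_dice -- number of extra dice (books, learners) added to the total
    z          -- how many standard errors the shift should stand out by

    The difference of two sample means over n throws each has variance
    2 * total_variance / n, so we need n >= 2 * z^2 * total_variance / shift^2.
    """
    total_variance = DIE_VARIANCE * (1 + noise_dice)
    return math.ceil(2 * z**2 * total_variance / shift**2)


if __name__ == "__main__":
    for extra in (0, 1, 3, 9):
        print(extra, "noise dice ->", throws_needed(0.5, extra), "throws")
```

Under these assumptions the required number of throws grows in proportion to
the number of dice in the total, which is the point of the analogy: every
"opening book die" or "learner die" you leave in makes the same small effect
more expensive to measure.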


 Nunn positions may favour one of the programs

You could argue that there are only a limited number of starting positions,
and that a program might never reach them when playing from its own book, so
this might disadvantage one of the engines. True, but if you think about it,
that is irrelevant for what you wanted to find out here.

If the object was just a simple match to determine which program is strongest,
this is a factor, but then the question raised is how vulnerable a chess
program is in unfamiliar positions. How good can a program be if it can only
play a limited number of openings well? Personally I'm more interested in doing
some analysis with a program and not so much in playing whole games, so I would
rather have a program that is not too dependent on its openings. Customers
demand wide books for commercial programs, so that is also a factor. Small books
made even smaller by the learners may give better match results but are not very
attractive to the customer. But I'm digressing.

 Autoplayer

You brought up that the autoplayer can also be a disturbing factor, and that is
true of course. I didn't read all the messages, so I don't really know if
indications of autoplayer problems came up in the threads.

So I basically tried to illustrate my reasoning about the downside of learners
and opening books, and I hope you can follow my argument a bit.
I wouldn't know if feelings got hurt; I'm sure you didn't mean to do that.
Surely Chessfun isn't discouraged so easily!

On a different subject, I think Jan Timman showed good foresight by going to
Bali and not playing in the Dutch Championship. Not necessarily because of the
weather though, it's very good here! And not that it isn't interesting so far,
but I don't think the sponsors are very happy with the controversies either.
Not really any fault of Fritz or Frans though.

Regards, Eelco



