Computer Chess Club Archives


Subject: Re: Nunn test is good!!

Author: Ratko V Tomic

Date: 11:27:08 10/06/99


Good points about learning. For a reproducible and fair test with
learning on, one would need to symmetrize the sequence of matchups
by reinstalling the programs from scratch and permuting their order.
That would take too much time.

> Obviously to say "20 games are enough, i don't care about
> statistic-theory" shows that you are kind of , uhm, "special".

I think this is a rather harsh judgment. A typical game may have
over 50 moves where the program's judgment is on the line, so with 20 games
you really have a test suite of over 1000 positions on which you can look at the
quality/strength of the program's "thinking".

Most of us here, when we get a new chess program, play several games
against it right away (usually after running the program through a handful of
our favorite test positions, the ones we have tested other programs with), and
within a couple of games we have a pretty good idea of how strong the program is.
After all, by that point we have had several hundred positions that we thought
through in depth along with the program, with immediate comparisons & feedback
on our judgments. In real-life situations calling for evaluation (be it
hiring someone for a job, marriage, or buying a car), you usually have a much
smaller sample of situations after which you need to issue a judgment.

Technically, small samples of games are still OK if examined intelligently.
Only a mechanistic/blind point-counting method of testing needs large
samples, since out of the whole game it keeps under 2 bits (at most
log2(3) = 1.5849.. bits, for the three outcomes win/draw/loss) of
information (from a game which has a thousand or more bits of information).
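
The arithmetic behind these figures can be checked with a rough sketch like the
one below. It assumes a game result is one of three outcomes (win, draw, loss),
so the most a result can carry is log2(3) bits. The function name and the
example outcome probabilities are hypothetical, chosen only for illustration;
the 20 games and 50 positions per game are the rough counts used in this thread.

```python
import math


def outcome_entropy_bits(p_win, p_draw, p_loss):
    """Shannon entropy, in bits, of a single game result (win/draw/loss)."""
    probs = (p_win, p_draw, p_loss)
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the entropy.
    return sum(-p * math.log2(p) for p in probs if p > 0)


# Upper bound: all three outcomes equally likely.
max_bits = math.log2(3)

# Hypothetical outcome distribution (an assumption, not measured data).
example_bits = outcome_entropy_bits(0.40, 0.35, 0.25)

# Rough count from the discussion: 20 games x 50 critical positions each.
positions_examined = 20 * 50

print(f"max bits per game result: {max_bits:.4f}")
print(f"bits for the example distribution: {example_bits:.4f}")
print(f"positions in a 20-game match: {positions_examined}")
```

Any uneven split between wins, draws and losses gives less than the log2(3)
maximum, so pure point counting keeps even less per game than the upper bound
suggests.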



