Computer Chess Club Archives

Subject: Re: How to use a [cough] EPD test suite to estimate ELO

Author: Peter Fendrich

Date: 13:50:38 02/11/99

On February 11, 1999 at 15:41:25, Dann Corbit wrote:

>Andreas Schwartmann asked an interesting question in r.g.c.c.:
>"I wonder if anyone can enlighten me on how to use various test suites, like
>LCT, LCT II and Covax. There are certain formulas on how to calculate the
>playing strength according to these test suites, right?"
>
>Now, ignoring the fact that they are full of bugs and the measures are probably
>bogus, how *does* one arrive at an ELO from a test suite evaluation?
>
>What is the actual mathematical basis for the calculations?

I don't think it's possible. I've never believed in tests like that.
They are fun and a curiosity, but not useful for measuring strength:

1) There will be a tendency to tune programs to these tests. I don't mean
   cheating (of course an issue as well), but the fact that you will keep
   versions that do well on the test even though they may not get any better
   results in games.
2) A program's strength doesn't depend only on its ability to find winning
   moves. The path to the winning position is even more important: the ability
   to avoid bad moves, especially in quiet positions, and the ability to play
   consistently across the different phases of the game (like leaving the
   opening and knowing what to do, or trading down to an endgame knowing that
   it hasn't turned a winning position into a dead draw).
3) The positional and tactical "themes" known in theory are quite numerous,
   and the number of possible combinations of these "themes" is huge. A
   program should be able to handle most of these combinations. Then add the
   "themes" not described in the literature.

If it is possible to build a fair test suite at all, it would consist of a huge
number of positions that would take an eternity to run through.

Despite all that, the formula for this huge test set would be interesting to
know anyway!

My suggestion for a starting point would be something like the tests Bent
Larsen made, where different moves give different scores. This is of course
more complicated to build, but it mirrors the situation in a real game better.
I don't think it is a good idea to count the time it took to solve each
position. I have a feeling that a fixed time for the whole set is better than a
fixed time for each position.
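
To make the idea concrete, here is a rough sketch of how such a graded test
could be scored. It is only my reading of the scheme, not Larsen's actual
method: every position carries a table of moves and points, a move not in the
table scores zero, and one time budget is shared by the whole set. The names
score_suite and pick_move, the point values and the position labels are all
made up for illustration; pick_move stands in for whatever asks the engine
for a move.

```python
import time


def score_suite(positions, pick_move, total_budget_s, clock=time.monotonic):
    """Sum graded move scores over a suite under one shared time budget.

    positions: list of (position_id, {move: points}) pairs (hypothetical data).
    pick_move: callable(position_id, remaining_seconds) -> move string.
               This is a stand-in for the engine; it decides for itself how
               much of the remaining time to spend on each position.
    Returns (points earned, maximum points possible).
    """
    start = clock()
    earned = 0
    possible = 0
    for position_id, move_points in positions:
        possible += max(move_points.values())
        remaining = total_budget_s - (clock() - start)
        if remaining <= 0:
            continue  # budget used up: the position scores zero
        move = pick_move(position_id, remaining)
        if clock() - start > total_budget_s:
            continue  # the answer arrived after the budget ran out
        # Solving time is deliberately not part of the score.
        earned += move_points.get(move, 0)
    return earned, possible


# Made-up positions and point tables, only to show the shape of the data.
demo_positions = [
    ("pos1", {"e4": 10, "d4": 7}),
    ("pos2", {"Nf3": 10, "c4": 4}),
    ("pos3", {"Rxf7": 10}),
]


def demo_picker(position_id, remaining_s):
    # A fake "engine" that answers instantly from a fixed table.
    return {"pos1": "d4", "pos2": "Nf3", "pos3": "Kh1"}[position_id]


result = score_suite(demo_positions, demo_picker, total_budget_s=60)
print(result)
```

Note that this only gives points earned out of points possible. It says
nothing about how to turn that number into a rating, which is the part I doubt
can be done.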

//Peter

Last modified: Thu, 15 Apr 21 08:11:13 -0700
