Computer Chess Club Archives



Subject: Serious lack of good test methodology

Author: Cesar Contreras

Date: 00:28:36 06/20/04


I have a serious problem testing improvements in my chess engine.

I use test suites to find bugs and to establish comparisons between different
versions of my engine. I know that I can't use test suites to calculate Elo;
that's clear to me.

But I think there could be a test that gives me an approximation (with an
acceptable error margin) of the comparative strength of different versions of my
engine with different modifications. I don't plan to cheat, so I'm not going to
tune my engine to solve the test.

I think I can make an analogy with IQ tests: they evaluate several aspects and
different kinds of intelligence. Or with a psychological test, which evaluates a
lot of things about a person. Both have an error margin and are affected by
cheating, environment, time, sickness, etc. But it's up to the person who runs
the test to try to avoid such problems.

Such a test could give not only a strength comparison but also more detailed
information, like performance in the opening, middlegame, and endgame, or maybe
something more specific, like how well the engine handles pawn structure,
mobility, and king safety.

Maybe there could be different versions of the test, each with a different
number of positions and its own error margin.
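
To illustrate how the error margin might depend on the number of positions,
here is a minimal Python sketch. It assumes (and this is only an assumption)
that the positions behave like independent samples, so the solved fraction
follows a binomial distribution and a normal approximation applies. The
function name solve_rate_margin and the 95% z-value are hypothetical choices
for this example, not part of any existing test suite.

```python
import math

def solve_rate_margin(solved, total, z=1.96):
    """Approximate half-width of a confidence interval for the solved fraction.

    Assumes independent positions (binomial model, normal approximation).
    z=1.96 corresponds to roughly 95% confidence.
    """
    p = solved / total
    return z * math.sqrt(p * (1.0 - p) / total)

if __name__ == "__main__":
    # Same solved fraction (60%), different suite sizes.
    for total in (50, 100, 300, 1000):
        solved = int(0.6 * total)
        margin = solve_rate_margin(solved, total)
        print(f"{total:5d} positions: 60% solved, margin about +/- {100 * margin:.1f}%")
```

Under these assumptions the margin shrinks with the square root of the number
of positions, so a suite four times larger only halves the error.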

A good test suite aimed specifically at chess programmers would be really great;
again, not looking for perfect results, but for approximations useful to us, and
not giving only a single number, but maybe several numbers that evaluate several
aspects.

Is there any test that does that?
Or is it just a dream?
If it is a dream, please share your method for testing your engine after
modifications. The number of modifications we make to our engines is big, so the
time spent on testing is really big, maybe 90% (I'm not sure about that; what do
you say?).

There are several problems with running tournaments to measure the performance
of the engine:
- Choosing an appropriate number of engines
- Selecting the engines
- Number of games
- Gauntlet, round robin, or Swiss
- Time control
And after running the tournament, how do I know the approximate error margin of
the result? Or the performance of the engine in several aspects (opening,
middlegame, endgame, handling of positional aspects)?
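
For the error-margin question, one common approximation can be sketched in
Python as below. It is only a sketch under stated assumptions: games are
treated as independent, the score is converted to an Elo difference with the
usual logistic formula, and the interval comes from a normal approximation on
the score. The function names elo_from_score and match_elo_estimate are made up
for this example.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_elo_estimate(wins, draws, losses, z=1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    Assumes independent games; z=1.96 gives roughly a 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / n
    std_err = math.sqrt(variance / n)

    # Keep scores strictly inside (0, 1) so the logarithm stays defined.
    def clamp(s):
        return min(max(s, 1e-6), 1.0 - 1e-6)

    return (elo_from_score(clamp(score)),
            elo_from_score(clamp(score - z * std_err)),
            elo_from_score(clamp(score + z * std_err)))

if __name__ == "__main__":
    for games in ((55, 30, 15), (550, 300, 150)):
        elo, low, high = match_elo_estimate(*games)
        print(f"+{games[0]} ={games[1]} -{games[2]}: "
              f"Elo {elo:+.0f} (about {low:+.0f} to {high:+.0f})")
```

With the same score percentage, ten times more games gives a much narrower
interval, which is why the number of games matters so much when comparing two
versions that are close in strength.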






Last modified: Thu, 15 Apr 21 08:11:13 -0700
