Computer Chess Club Archives


Subject: Re: Test Suites -- How do you decide on pass or fail?

Author: Enrique Irazoqui

Date: 04:23:15 06/28/00


On June 28, 2000 at 06:57:46, Tim Foden wrote:

>I am just adding code to GreenLight to run EPD test suites... but I have hit a
>problem.
>
>Just how do you mark the results?!
>
>At the moment I am doing this:
>Take static eval of start position.
>Start engine thinking.
>If the correct move is seen with (eval >= start_eval + 1.00) then say passed.
>If after max. time, move is not taken -> mark as failed, else mark as uncertain.
>
>I guess I could just go for the criteria that if 3 plys in a row see the correct
>move, then it passes?  Does this work?
>
>Any suggestions welcome.

I like to use tactical positions whose solution a program can't pick unless it
sees the whole line, so I can run the test automatically overnight without
having to watch the evaluation. For instance, yesterday's position
2R5/4k1pp/p3p3/4P1p1/p3N3/q1P4P/2P1P1P1/1K6 w - - 0 1; bm Nd6; is not useful
because Nd6 can be picked for the wrong reason, as Shredder and SOS did. Once
the program has found the solution, I let it run, just in case, for 5 more
plies or until a given time limit, usually 15 minutes.
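
A rough sketch of that rule in Python, in case it helps. The names (Iteration,
judge, extra_plies, time_limit) are made up for illustration and are not taken
from any engine or test tool. It also assumes two things the description above
leaves open: that the count starts over if the program drops the solution move,
and that a program still holding the solution when the time limit is reached
counts as a pass.

```python
from typing import Iterable, NamedTuple, Optional


class Iteration(NamedTuple):
    """One completed search iteration reported by the engine (hypothetical)."""
    depth: int        # search depth in plies
    best_move: str    # move the engine prefers at this depth
    seconds: float    # elapsed time when this iteration finished


def judge(iterations: Iterable[Iteration], solution: str,
          extra_plies: int = 5, time_limit: float = 15 * 60) -> str:
    """Return "pass" or "fail" for a single test position."""
    found_at: Optional[int] = None   # depth at which the solution was first held
    for it in iterations:
        if it.seconds > time_limit:
            break                    # ignore anything finished after the limit
        if it.best_move == solution:
            if found_at is None:
                found_at = it.depth
            if it.depth - found_at >= extra_plies:
                return "pass"        # solution survived the extra plies
        else:
            found_at = None          # assumption: changing its mind resets the count
    # Out of time (or out of iterations): pass only if the solution is still held.
    return "pass" if found_at is not None else "fail"
```

Feeding it the depth, best move and elapsed time from each completed iteration
is enough. It does not look at the score at all, which is the point of choosing
positions where the move can't be found for the wrong reason.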

Enrique




