Computer Chess Club Archives


Subject: Re: Test Suites -- How do you decide on pass or fail?

Author: Bruce Moreland

Date: 13:21:10 06/28/00


On June 28, 2000 at 06:57:46, Tim Foden wrote:

>I am just adding code to GreenLight to run EPD test suites... but I have hit a
>problem.
>
>Just how do you mark the results?!
>
>At the moment I am doing this:
>Take static eval of start position.
>Start engine thinking.
>If the correct move is seen with (eval >= start_eval + 1.00) then say passed.
>If after max. time, move is not taken -> mark as failed, else mark as uncertain.
>
>I guess I could just go for the criteria that if 3 plys in a row see the correct
>move, then it passes?  Does this work?

Sometimes you'll get a dumb move selected for the first three plies, for no
apparent reason.  If this dumb move happens to be the answer to the problem, you
get a bad result.

That's the disadvantage of the "hold for a while" approach.  The advantage, of
course, is that you could do a one-minute-per-position WAC test in about ten
minutes.
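
Here is a rough sketch of the "hold for a while" rule, just to make the failure mode concrete.  It assumes the engine hands you a log of (elapsed seconds, best move) pairs, one per completed iteration.  The function name, the log format, and the moves in the example are all made up for illustration; they are not from GreenLight or any other engine.

```python
from typing import Iterable, Optional, Tuple

def hold_for_n_iterations(
    iterations: Iterable[Tuple[float, str]],
    solution: str,
    hold: int = 3,
) -> Optional[float]:
    """Return the elapsed time at which the position is declared passed,
    i.e. the first moment the solution has been the best move for `hold`
    consecutive iterations.  Return None if that never happens."""
    streak = 0
    for elapsed, best_move in iterations:
        # Count consecutive iterations on the solution; any other move resets.
        streak = streak + 1 if best_move == solution else 0
        if streak >= hold:
            return elapsed  # the test would stop searching here
    return None

# Hypothetical log: the engine likes Qg3 at shallow depth, then drops it.
log = [(0.01, "Qg3"), (0.02, "Qg3"), (0.05, "Qg3"), (0.40, "Rxb2"), (2.10, "Rxb2")]
print(hold_for_n_iterations(log, "Qg3"))  # passes at 0.05 although the engine later abandons Qg3
```

The early return is where the speed comes from, and it is also why a shallow fluke gets counted as a solve.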

I prefer to use "hold until end of test".  You record the time taken to find the
move, but if you switch away from it, you reset the time.  If you find it again
later, you get the longer time.
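
A minimal sketch of that bookkeeping, assuming the test harness sees a log of (elapsed seconds, best move) pairs for the whole time allotted to the position.  The function name, the log format, and the example moves are hypothetical, not taken from any particular engine.

```python
from typing import Iterable, Optional, Tuple

def hold_until_end(
    iterations: Iterable[Tuple[float, str]],
    solution: str,
) -> Optional[float]:
    """Return the time at which the engine found the solution and then kept
    it until the end of the test, or None if it finished on another move."""
    solve_time = None
    for elapsed, best_move in iterations:
        if best_move == solution:
            if solve_time is None:
                solve_time = elapsed  # start (or restart) the clock here
        else:
            solve_time = None  # switched away: reset the time
    return solve_time

# Hypothetical log: found early, lost, found again and held to the end.
log = [(0.1, "Qg3"), (0.5, "Rxb2"), (12.0, "Qg3"), (30.0, "Qg3")]
print(hold_until_end(log, "Qg3"))  # 12.0, the later time
```

Note that the loop never exits early: the whole log has to be consumed before the result means anything.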

This makes a 300-position 1-minute test take 5 hours no matter how easy it is,
but I think the results are a little more likely to make sense.

bruce



Last modified: Thu, 15 Apr 21 08:11:13 -0700
