Computer Chess Club Archives


Subject: Re: Test Suites -- How do you decide on pass or fail?

Author: blass uri

Date: 01:59:22 06/29/00


On June 28, 2000 at 16:21:10, Bruce Moreland wrote:

>On June 28, 2000 at 06:57:46, Tim Foden wrote:
>
>>I am just adding code to GreenLight to run EPD test suites... but I have hit a
>>problem.
>>
>>Just how do you mark the results?!
>>
>>At the moment I am doing this:
>>Take static eval of start position.
>>Start engine thinking.
>>If the correct move is seen with (eval >= start_eval + 1.00) then say passed.
>>If after max. time, move is not taken -> mark as failed, else mark as uncertain.
>>
>>I guess I could just go for the criteria that if 3 plys in a row see the correct
>>move, then it passes?  Does this work?
>
>Sometimes you'll get a dumb move selected for the first three plies, for no
>apparent reason.  If this dumb move happens to be the answer to the problem, you
>get a bad result.
>
>That's the disadvantage of the "hold for a while" approach.  The advantage of
>course is that you could do a one-minute per position WAC test in about ten
>minutes.
>
>I prefer to use "hold until end of test".  You record the time taken to find the
>move, but if you switch away from it, you reset the time.  If you find it again
>later, you get the longer time.
>
>This makes a 300-position 1-minute test take 5 hours no matter how easy it is,
>but I think the results are a little more likely to make sense.
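A minimal sketch of the two scoring rules discussed above, for comparison. It assumes the engine reports one (elapsed seconds, best move) pair per completed iteration; the function names and the sample moves are hypothetical and not taken from GreenLight or any other engine.

```python
from typing import List, Optional, Tuple

# One entry per completed iteration: (elapsed seconds, best move so far).
# This record format is an assumption made for the sketch.
Update = Tuple[float, str]


def hold_until_end_time(updates: List[Update], solution: str) -> Optional[float]:
    """'Hold until end of test': return the time at which the engine last
    switched to the solution and then kept it until the search ended.
    Returns None if the final best move is not the solution."""
    found_at: Optional[float] = None
    for elapsed, move in updates:
        if move == solution:
            if found_at is None:
                found_at = elapsed  # start (or restart) the clock here
        else:
            found_at = None  # switched away: the earlier time no longer counts
    return found_at


def passes_n_in_a_row(updates: List[Update], solution: str, n: int = 3) -> bool:
    """'Hold for a while': pass as soon as n consecutive iterations
    report the solution, regardless of what happens afterwards."""
    run = 0
    for _, move in updates:
        run = run + 1 if move == solution else 0
        if run >= n:
            return True
    return False


# The engine picks the "right" move for three shallow iterations, then drops it.
lucky = [(0.1, "Qxh7"), (0.2, "Qxh7"), (0.3, "Qxh7"), (5.0, "Nf3"), (60.0, "Nf3")]
# The engine finds the move, loses it, and finds it again for good.
steady = [(0.1, "Qxh7"), (0.5, "Nf3"), (12.0, "Qxh7"), (60.0, "Qxh7")]
```

With these sample histories, the three-in-a-row rule passes `lucky` even though the engine ends on a different move, while the hold-until-end rule rejects it and credits `steady` with the later time of 12.0 seconds.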

There is one exception: when your program finishes its search in less than 1
minute.
This can happen when your program finds a forced mate, when it reaches the
maximal depth that it is programmed to search, or when the position is a
tablebase position.

Uri
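The three early-exit cases above could be checked in a test harness roughly as follows. This is only a sketch: `SearchResult`, its fields, and `early_stop_reason` are hypothetical names, and a real engine may expose this information differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    """Hypothetical summary of one finished search."""
    elapsed: float       # seconds actually used
    depth: int           # last completed iteration depth
    mate_found: bool     # the search proved a forced mate
    tablebase_hit: bool  # the root position was found in a tablebase


def early_stop_reason(result: SearchResult,
                      time_limit: float,
                      max_depth: int) -> Optional[str]:
    """Return why the search ended before the time limit, or None if it
    used the full time (or no known reason applies)."""
    if result.elapsed >= time_limit:
        return None  # ran the full time: not an early exit
    if result.mate_found:
        return "forced mate"
    if result.tablebase_hit:
        return "tablebase position"
    if result.depth >= max_depth:
        return "maximum depth"
    return None
```

A harness that times a whole suite could use the reported reason to explain why a run of 1-minute positions finished sooner than positions times 60 seconds.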




Last modified: Thu, 15 Apr 21 08:11:13 -0700
