Computer Chess Club Archives



Subject: Re: an example how users - not programmers - use tests

Author: David Dahlem

Date: 12:17:47 06/20/04



On June 20, 2004 at 13:42:59, Steve Glanzfeld wrote:

>On June 20, 2004 at 12:10:57, David Dahlem wrote:
>
>>On June 19, 2004 at 12:02:19, Steve Glanzfeld wrote:
>
>>>Users are interested in the engine's performances in such tests, simply.
>>>Assigning ratings certainly isn't the main thing. Most often it is sufficient
>>>to count and compare the number of solutions.
>
>>I've seen numerous examples of one engine solving a test suite position in a few
>>seconds, while another engine of known equal game playing strength never finds
>>the solution, even after hours of analysis. To me, this makes test suites
>>worthless, or at least makes their results very difficult to interpret.
>
>As long as your test suites consist of ONLY ONE SINGLE POSITION, you're right
>:)))
>
>Steve

Note that I said "numerous examples". So if, for example, 10% of a test suite's
positions give unreliable results, how reliable are the total results? And it
could be more than 10%; how is a test suite user to know? I gave an extreme
example. But suppose that in a 20-minute-per-position suite, some engines out
of a group of equal playing strength fail to find some solutions. How is a user
to know whether the same principle applies? Test positions can be useful for
debugging an engine, but for strength testing between engines, I think it's a
waste of valuable time. :-)

Regards
Dave
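
To put a rough number on the 10% point above, here is a minimal worst-case sketch in Python. The function names, the suite size and the solved counts are all hypothetical, chosen only for illustration; it assumes that an "unreliable" position can be solved or missed by either engine regardless of playing strength.

```python
def unreliable_count(n_positions: int, unreliable_percent: int) -> int:
    """Number of positions whose result may say nothing about strength.

    Uses integer ceiling division so that a partial position counts as one.
    """
    return (n_positions * unreliable_percent + 99) // 100


def gap_survives(solved_a: int, solved_b: int,
                 n_positions: int, unreliable_percent: int) -> bool:
    """True if engine A's lead (or deficit) cannot be explained away.

    Worst case: every unreliable position went to the same engine, so the
    gap in solved counts could be off by up to unreliable_count().
    """
    k = unreliable_count(n_positions, unreliable_percent)
    return abs(solved_a - solved_b) > k


# Hypothetical 100-position suite with 10% unreliable positions.
print(unreliable_count(100, 10))      # 10
print(gap_survives(62, 55, 100, 10))  # False: a 7-position gap is within the noise
print(gap_survives(75, 55, 100, 10))  # True: a 20-position gap is not
```

Under these assumptions, only a gap larger than the number of unreliable positions says anything, and the user usually does not know that number in the first place.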




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.