Computer Chess Club Archives


Subject: Re: Testresults WAC, LCT II, WM (K)

Author: Jeroen van Dorp

Date: 16:48:14 09/29/03


On September 29, 2003 at 19:29:40, Uri Blass wrote:

>He already explained his reason not to do it in the following words
>
>"You can not manually test 400 positions for each engine in reasonable time."


I read his explanation. Does the impossibility of testing something suddenly
validate a wrong test method?



>I guess that he tried to do the best test in limited time and he gets some good
>estimate for the strongest engine in similar positions.

The whole _point_ is that there's a bigger chance than he suggests that he is
_not_ getting the valid estimate he is looking for.

It could very well be that the results don't change; there's no doubt that is
possible. But if engine 1 changes its assessment on only 3 out of 10 tests, it
could well end up in the middle or even at the bottom of the pile instead of at
the top.
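
To put rough numbers on that, here is a minimal Python sketch. The engine names
and scores are made up for illustration (none of these figures come from the
actual test results), and the helper functions are hypothetical. It only shows
how far a leader can drop in a tightly packed field when 3 of its 10 results
flip.

```python
# Hypothetical solved-position counts (out of 10) for five made-up engines.
# These are illustrative assumptions, not data from the tests under discussion.
scores = {"engine1": 9, "engine2": 8, "engine3": 7, "engine4": 7, "engine5": 6}


def rank_of(name, table):
    """Return the 1-based rank of `name`; tied engines share the better rank."""
    return 1 + sum(1 for other in table.values() if other > table[name])


def rerank_after_flips(name, flips, table):
    """Rank of `name` if `flips` of its solved positions turn into misses."""
    changed = dict(table)
    changed[name] = max(0, table[name] - flips)
    return rank_of(name, changed)


before = rank_of("engine1", scores)
after = rerank_after_flips("engine1", 3, scores)
print(before, after)  # prints: 1 4
```

With these assumed scores, engine 1 goes from first place to a shared fourth
out of five. How big the drop is depends entirely on how close the field is,
which is exactly why a small sample is risky.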

A test suite is already a creaky way to compare engine strengths, and limiting
the method even further only diminishes the value of the results. If he uses a
method to test engines, he should use it the way it was meant to be used;
otherwise the results grow less accurate and less usable.

There's no law forbidding him from sticking to his own method; there's just a
rule that using the wrong method generally generates the wrong results.

J.



Last modified: Thu, 15 Apr 21 08:11:13 -0700
