Computer Chess Club Archives



Subject: Re: Chess Tiger 15 vs Fritz7

Author: Christophe Theron

Date: 09:36:41 06/11/02



On June 11, 2002 at 01:42:10, Uri Blass wrote:

>On June 10, 2002 at 22:29:44, Christophe Theron wrote:
>
>>On June 10, 2002 at 15:28:14, Rajen Gupta wrote:
>>
>>>I have read somewhere (I think it was hinted at in one of the interviews which
>>>Frans Morsch gave to one of the Indian newspapers) that at any given time there
>>>are several different versions of Fritz being developed, the inference being
>>>that the one that is actually released is not necessarily the strongest one;
>>>it's the one that is just strong enough.
>>>
>>>Frans Morsch apparently has one ready whenever a new upstart arrives on the
>>>scene. I won't be surprised if there is no new Fritz till something overtakes the
>>>current version.
>>>
>>>rajen
>>
>>
>>
>>It does not make sense.
>>
>>Look at the small margin between Fritz and the program just behind it (Tiger) on
>>the SSDF.
>>
>>Why would Frans take the risk of publishing an engine that might fail to achieve
>>the first place on the SSDF if he has something better?
>
>Maybe he does not know which engine is the best.
>
>The only way to be sure that engine A is better than engine B is by games.
>You can always have other tests in order to guess but they are only an estimate.
>
>I know that you say that you do not use games against other opponents but I
>think that it is a mistake.
>
>The fact that you probably have some test that usually gives
>the same results as games is a good reason to use that test for testing one
>change but when you decide to release a new version the only way to be sure that
>it is better is by a lot of games (unless the change only makes Tiger faster).
>
>Uri



In order to have a top chess program you must have a method to decide if a
change is an improvement or not. One of the requirements of this method is that
you must be able to get a result in a short period of time (preferably less than
4 days in the most difficult cases).

There are many little changes to test before you get a version significantly
stronger than your last release.

It is not practical to let other people test several versions and decide for you.
First, you can't rely on results you have not controlled yourself (there are
already too many possible inconsistencies in the experiments you set up
yourself). Second, these people would have to play a lot of games under
equivalent conditions to reach statistical relevance, which you seldom
get, because you cannot ask people to play 500 games in a row.
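
To put a rough number on "statistical relevance", here is a minimal sketch that
turns a win/draw/loss record into an Elo difference with an approximate 95%
interval. It is not the test method described in this post, which is not
specified here. The function names, the normal approximation, and the standard
logistic Elo formula are assumptions chosen for illustration.

```python
import math


def _clamp(score, eps=1e-6):
    """Keep a score fraction strictly inside (0, 1) so the Elo formula is defined."""
    return min(max(score, eps), 1.0 - eps)


def elo_from_score(score):
    """Elo difference implied by a score fraction, using the logistic Elo model."""
    s = _clamp(score)
    return -400.0 * math.log10(1.0 / s - 1.0)


def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, estimate, high) Elo difference for a match result.

    Assumption: games are independent and the mean score is roughly normal,
    which is only reasonable for a fairly large number of games.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("at least one game is required")
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    stderr = math.sqrt(variance / n)
    return (
        elo_from_score(score - z * stderr),
        elo_from_score(score),
        elo_from_score(score + z * stderr),
    )


if __name__ == "__main__":
    # Hypothetical record: an even 500-game match.
    low, mid, high = elo_interval(wins=200, draws=100, losses=200)
    print(f"Elo difference: {mid:+.1f} (approx. 95% interval {low:+.1f} to {high:+.1f})")
```

Under these assumptions, even an even 500-game match leaves an uncertainty of
a few tens of Elo points in each direction, which is wider than the small
margins being discussed in this thread.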

I cannot believe that a serious chess programmer would use such a lousy
selection method.

Testers' feedback is very valuable for spotting problems, bugs, and gaps in the
program's knowledge, and more generally for good advice on which directions to
work on.

Testers' feedback is used to get quality data, human advice, and creativity. You
generally cannot use it to get a quantity of statistically relevant data.

The final decision about what is an improvement and what is not must be made by
the programmer himself, with a cold, scientifically controlled, objective test
method.



    Christophe


