Author: Christophe Theron
Date: 09:36:41 06/11/02
On June 11, 2002 at 01:42:10, Uri Blass wrote:

>On June 10, 2002 at 22:29:44, Christophe Theron wrote:
>
>>On June 10, 2002 at 15:28:14, Rajen Gupta wrote:
>>
>>>i have read somewhere (i think it was hinted in one of the interviews which
>>>frank morsch gave to one of the indian newspapers)that at any given time, there
>>>are several different versions of fritz being developed:- the inference being
>>>that and the one that is actually released is not necessarily the strongest one;
>>>its the one that is just strong enough.
>>>
>>>frank morsch apparently has one ready whenever a new upstart arrives on the
>>>scene.i wont be surprised if there is no new fritz till something overtakes the
>>>current version.
>>>
>>>rajen
>>
>>
>>
>>It does not make sense.
>>
>>Look at the small margin between Fritz and the program just behind it (Tiger) on
>>the SSDF.
>>
>>Why would Frans take the risk of publishing an engine that might fail to achieve
>>the first place on the SSDF if he has something better?
>
>Maybe he does not know which engine is the best.
>
>The only way to be sure that engine A is better than engine B is by games.
>You can always have other tests in order to guess but they are only an estimate.
>
>I know that you say that you do not use games against other opponents but I
>think that it is a mistake.
>
>The fact that you probably have some test that usually gives
>the same results as games is a good reason to use that test for testing one
>change but when you decide to release a new version the only way to be sure that
>it is better is by a lot of games(unless the change is only doing tiger faster).
>
>Uri

In order to have a top chess program you must have a method to decide if a change is an improvement or not. One of the requirements of this method is that you must be able to get a result in a short period of time (preferably less than 4 days in the most difficult cases).
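To give a rough feel for what "getting a result" costs when the test is plain game playing, here is a minimal sketch of the statistical margin on a match result. It is not the test method referred to in this post (which is not described here). It assumes two engines of roughly equal strength, independent games, a normal approximation, the usual logistic Elo formula, and a draw ratio of 30%. The function names and the draw ratio are illustrative assumptions.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_margin(games, draw_ratio=0.3, z=1.96):
    """Approximate 95% margin on the score fraction after `games` games.

    Assumes equal engines: win and loss each have probability
    (1 - draw_ratio) / 2, so the per-game variance is (1 - draw_ratio) / 4.
    """
    variance = (1.0 - draw_ratio) / 4.0
    return z * math.sqrt(variance / games)


def elo_margin(games, draw_ratio=0.3, z=1.96):
    """The same margin expressed in Elo points around an even score."""
    return elo_from_score(0.5 + score_margin(games, draw_ratio, z))


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(n, "games: about +/-", round(elo_margin(n), 1), "Elo")
```

Under these assumptions, a 500-game match still leaves a margin of roughly 25 Elo either way, so a small improvement cannot be told apart from noise without many more games.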
There are many little changes to test before you get a version significantly stronger than your last release.

It is not practical to let people test several versions and decide for you, because you can't rely on results you have not controlled yourself (there are too many possible inconsistencies even in the experiments you set up yourself), and because these people would have to play a lot of games under equivalent conditions in order to get statistical relevance (which you seldom get, because you cannot ask people to play 500 games in a row).

I cannot believe that a serious chess programmer would use such a lousy selection method.

Testers' feedback is very valuable for spotting problems or gaps in the program's knowledge, finding bugs, and more generally for good advice on general directions to work on.

Testers' feedback is used to get quality data, human advice and creativity. You generally cannot use it to get a quantity of statistically relevant data.

The final decision about what is an improvement and what is not must be made by the programmer himself, with a cold, scientifically controlled, objective test method.


    Christophe
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.