Author: Dann Corbit
Date: 11:52:57 04/27/00
On April 27, 2000 at 12:05:21, ujecrh wrote:

>On April 26, 2000 at 20:52:45, Dann Corbit wrote:
>
>>On April 26, 2000 at 20:41:31, Flemming Rodler wrote:
>>
>>>Hi all,
>>>
>>>I was just wondering how people determine if a new version of the program they
>>>are developing is better than the current. Are there other methods than just
>>>testing it against a whole bunch of other chess engines?
>>
>>That's the most important one, and (really) the only one you can trust [provided
>>that you include human opponents as well].
>>
>>They also test with EPD test suites. Not reliable indicators of program play.
>>
>>Sometimes the programmer will have a pretty good idea that a change has helped
>>before testing.
>
>Improvement in one suite probably does not mean anything but if you test the
>program against a bunch of suites and it behaves better in almost all of them
>then I guess you can take it as a good sign that the engine is improved.

A good sign, yes, but a very artificial one. And quite possibly giving a bad
answer. It is easy to alter engine code so that it scores well on test suites
but plays terribly. Look at Ed Schroder's "personality" tests and you will see
that the settings that do well on EPD test suites are not the same settings that
play well in games.
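
Since game results are the measure being recommended here, a rough sketch of how a match score might be turned into a rating estimate may help. This is only an illustration, not a method described in this thread: the function name `elo_diff` and its win/draw/loss arguments are made up for the example, and it assumes the standard logistic Elo formula with a simple normal approximation for the 95% interval.

```python
import math


def elo_diff(wins, draws, losses):
    """Return (estimate, low, high): Elo difference implied by a match
    score, with a rough 95% interval (normal approximation).

    Hypothetical helper for illustration only.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / n
    if score <= 0.0 or score >= 1.0:
        raise ValueError("a 0% or 100% score has no finite Elo estimate")

    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score

    def to_elo(p):
        # Logistic Elo model: expected score p maps to a rating difference.
        return -400.0 * math.log10(1.0 / p - 1.0)

    # Clamp the interval ends so the logarithm stays defined.
    low = to_elo(max(score - 1.96 * se, 1e-6))
    high = to_elo(min(score + 1.96 * se, 1.0 - 1e-6))
    return to_elo(score), low, high


if __name__ == "__main__":
    est, low, high = elo_diff(wins=40, draws=30, losses=30)
    print(f"estimate {est:+.0f} Elo, 95% interval {low:+.0f} to {high:+.0f}")
```

With only a hundred games the interval is wide, which is one reason a short match by itself settles very little.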
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.