Author: Drexel, Michael
Date: 02:03:49 09/30/03
On September 29, 2003 at 19:48:14, Jeroen van Dorp wrote:

>On September 29, 2003 at 19:29:40, Uri Blass wrote:
>
>>He already explained his reason not to do it in the following words
>>
>>"You can not manually test 400 positions for each engine in reasonable time."
>
>
>I read his explanation. Is the impossibility to test something suddenly
>validating a wrong test method?
>
>
>>I guess that he tried to do the best test in limited time and he gets some good
>>estimate for the strongest engine in similar positions.
>
>The whole _point_ is that there's a bigger chance than he suggests that he is
>_not_ getting the valid estimate he is looking for.

The chance is in fact very small.

>
>It could very well be that the results don't change. There's no doubt about that
>chance. But if engine 1 changes his assessment on only 3 out of 10 tests, it
>could well end up in the middle or even at the bottom of the pile instead of at
>the top.

Actually I repeated 3 of 8 test runs with extra ply 99 for WM-Test King attack.

The King 3.23 (AntiCM)  19/38  16.63s/38.31s  (21/38 before)
Shredder 7.04           15/38  17.38s/43.17s  (17/38 before)
Hiarcs 9                11/38  20.73s/48.63s  (13/38 before)

All engines performed slightly worse. Exactly what I had expected.

>
>A test suite already is a creaky way to compare engine strengths, and limiting
>the method even further only diminishes result value. If he uses a method to test
>engines, he should use it the way it was meant to be used, else the results grow
>in inaccuracy and diminish in usability.

There is no way it was meant to be used. Extra ply = 1 is even the default setting.

Michael

>
>There's no law forbidding him to stick to his own method, there's just a rule
>that using the wrong method generally generates the wrong results.
>
>J.
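
As a quick check of the figures quoted in the reply, here is a minimal Python sketch that compares the solved counts before and after the re-run with extra ply 99. Only the scores out of 38 come from the post. The names `scores`, `ranking`, `changes` and `order_unchanged` are made up for illustration, and the sketch says nothing about how the test tool itself computes its results.

```python
# Solved positions out of 38, taken from the post:
# (before, after the re-run with extra ply 99).
scores = {
    "The King 3.23 (AntiCM)": (21, 19),
    "Shredder 7.04": (17, 15),
    "Hiarcs 9": (13, 11),
}


def ranking(index):
    """Return engine names sorted from most to fewest solved positions.

    index 0 selects the earlier run, index 1 the re-run.
    """
    return sorted(scores, key=lambda name: scores[name][index], reverse=True)


before_order = ranking(0)
after_order = ranking(1)

# Change in solved positions per engine (negative means fewer solved).
changes = {name: after - before for name, (before, after) in scores.items()}

# True when the re-run leaves the relative order of the engines intact.
order_unchanged = before_order == after_order

for name in after_order:
    before, after = scores[name]
    print(f"{name}: {before}/38 -> {after}/38 ({changes[name]:+d})")
print("Order unchanged:", order_unchanged)
```

For these three engines, this compares the per-engine change in solved positions and whether the ranking moved between the two runs.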
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.