Author: Mike S.
Date: 14:23:52 06/12/04
On June 12, 2004 at 11:32:03, Robert Hyatt wrote:

>(...)
>This shows that such tests are basically flawed. The test should state "The
>time to solution is the time where the engine chooses the right move, and then
>sticks with it from that point forward, searching at least 30 minutes more..."

Why "should..."?? This *is* the condition for a correct solution in the WM Test and always has been, with the exception that the max. time is 20 minutes/pos. A solution is counted from the time when an engine has found *and kept* the solution move until the full testing time of 20 minutes. Rolf fails to inform you about that, or he doesn't know it himself. Does that surprise you?

(You can always claim that the test time is too short, but if you, for example, run every position for a whole day, you'll still find engines which would switch to a wrong move after 26 hours. So you have to draw a line somewhere - and 20 minutes/pos. is a time for "intensive analysis"; a normal game will almost never take more than 10 minutes per pos., and not more than 3 minutes/pos. on average...)

http://www.computerschach.de/test/WM-Test.zip
(English version included, and results of 4 Crafties.)

I hope you didn't assume that the WM-Test authors and the complete audience who uses it are idiots who count a "pseudo solution" which is found e.g. after 12 seconds, when from 42 secs. to 7 min. an engine switches to a wrong move, etc.?? Of course not. A high percentage of CSS readers are experienced, advanced computer chess users (at least). CSS itself has built, informed and developed that expert audience (I guess the US has nothing comparable, unfortunately).

- Also, advice has been given to set the "extra plies" parameter for automatic test suite functions to 99, to ensure that the complete testing time is used for each position. But in general, we have recommended testing manually and watching the engine's thinking process to get impressions, so to speak.

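To make the counting rule above concrete, here is a minimal sketch in Python of how "found *and kept* until the full 20 minutes" could be evaluated from a log of best-move changes. It is only an illustration, not the WM-Test's own tooling: the function name `solution_time`, the log format (a list of `(seconds, move)` entries) and the move names are assumptions made up for the example.

```python
from typing import Optional, Sequence, Tuple

MAX_TIME = 20 * 60  # full testing time per position, in seconds


def solution_time(
    changes: Sequence[Tuple[float, str]],
    solution: str,
    max_time: float = MAX_TIME,
) -> Optional[float]:
    """Return the time from which the engine held the solution move
    without interruption until max_time, or None if it did not.

    `changes` is a hypothetical log format: chronological (seconds, move)
    entries, one for each time the engine's best move changed.
    """
    found_at: Optional[float] = None
    for t, move in changes:
        if t > max_time:
            break  # whatever happens after the time limit is not counted
        if move == solution:
            if found_at is None:
                found_at = t  # start of a new uninterrupted streak
        else:
            found_at = None  # switched away: the earlier find is void
    return found_at


# Found after 12 s, wrong move from 42 s, back on the solution at 7 min:
print(solution_time([(12, "Nf5"), (42, "Qd2"), (420, "Nf5")], "Nf5"))  # 420
# Found after 12 s, then switched away and never came back: not solved.
print(solution_time([(12, "Nf5"), (42, "Qd2")], "Nf5"))  # None
# A switch to a wrong move after 20:00 does not affect the result.
print(solution_time([(12, "Nf5"), (1500, "Qd2")], "Nf5"))  # 12
```

The first call is the "pseudo solution" case from above: the 12-second find is discarded and the position counts as solved at 7 minutes (420 s), or as unsolved if the engine never returns to the move.
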
I'm a bit disappointed by your statement that "...such tests are basically flawed. The test should," when indeed it *does* just that.

>That stops this kind of nonsensical "faster = worse" problem. Because as is,
>the test simply is meaningless when changing nothing but the hardware results in
>a poorer result...

Are you aware that only some (few) of the positions are affected by that problem? The WM-Test has 100 positions. Some engines show that behaviour in some of the positions (different engines in different positions). Some fail to finally solve due to that, some solve but would change to a wrong move after 20:00, etc.

Can you guarantee that any single test position you use (and pls don't tell me you use none :-)) is not affected by that problem? Who can guarantee that? Engines are creative in finding ways to decide on the correct move, but for the wrong reason, sometimes... You are aware that it is very difficult to avoid it 100%, especially when a large test suite is compiled?

So please be fair.

Regards,
Mike Scheidl
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.