Author: Bruce Moreland
Date: 13:53:46 01/13/98
On January 13, 1998 at 15:49:58, Don Dailey wrote:

>>This is not perfect. I should probably normalize the numbers somehow so
>>that problems in which both of the programs finish very close to the
>>maximum allowable times don't get more weight than those in which
>>neither of them can quite finish that last ply. Also, this doesn't take
>>into account that one of the versions might be getting closer to the
>>real answer, and therefore is taking more time per ply. And finally, I
>>have had a problem with disk caching -- the second run on any given
>>night usually goes faster than the first one, so when I run these
>>suites, some of the results are a little bogus.
>
>All of this stuff is a mess. I don't think the way problem sets are
>typically scored make much sense. They should give credit for quicker
>solutions in my opinion not just total solved in less than x minutes.

I'll try again. My first response to this was chopped off due to lag.

This is not how I am using this. I am not checking times to solution, at all. I am comparing time to finish the last ply that both versions finished. If version A finishes 8 plies in a position, and version B finishes 9, I compare the times taken to finish 8 plies.

If version A and version B don't differ very much, and you use a suite that is large enough, and positional enough, this can tell you which is faster. I would mainly do this if I am trying to figure out if a performance change really increased performance. Some people try to figure out if they have gotten faster by looking at nodes per second, which is wrong, because doing more nodes says nothing about whether you've also decreased efficiency.

If I add a new extension, I will still collect this information, because I think it is interesting to know what effect my extension has had upon overall search depth. A version that takes twice as long, on average, to get to depth D, might be a little suspect unless it is somehow solving *everything* faster.
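To make the comparison concrete, here is a rough Python sketch of the "last ply both versions finished" idea. It is an illustration only, not the actual test harness: the function names, the dictionary layout (position id mapped to a list of cumulative seconds per completed ply), and the use of a geometric mean to keep long-running positions from dominating are all assumptions made for the example.

```python
from math import exp, log


def common_depth_ratio(times_a, times_b):
    """Compare two versions on one position.

    times_a / times_b: cumulative seconds at which each version finished
    ply 1, ply 2, ... (hypothetical data layout).
    Returns time_b / time_a at the deepest ply BOTH finished, or None
    if there is no usable common ply.
    """
    depth = min(len(times_a), len(times_b))
    if depth == 0:
        return None
    t_a, t_b = times_a[depth - 1], times_b[depth - 1]
    if t_a <= 0 or t_b <= 0:
        return None  # clock too coarse to say anything
    return t_b / t_a


def suite_speed_ratio(results_a, results_b):
    """Geometric mean of the per-position ratios over a whole suite.

    A value below 1.0 suggests version B reaches the common depth faster.
    The geometric mean is an assumed normalization, chosen so that each
    position carries equal weight regardless of its absolute time.
    """
    ratios = []
    for pos in results_a.keys() & results_b.keys():
        r = common_depth_ratio(results_a[pos], results_b[pos])
        if r is not None:
            ratios.append(r)
    if not ratios:
        raise ValueError("no position with a common finished ply")
    return exp(sum(log(r) for r in ratios) / len(ratios))


# Version A finished 3 plies on p1, version B finished 4: compare at ply 3.
results_a = {"p1": [0.1, 0.5, 2.0], "p2": [0.2, 1.0]}
results_b = {"p1": [0.1, 0.4, 1.0, 5.0], "p2": [0.2, 2.0, 9.0]}
ratio = suite_speed_ratio(results_a, results_b)
```

Note that the extra ply one version finished is simply ignored, and nothing here looks at whether the position was solved or at nodes per second.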
But I think that testing like this is especially useful when you are trying to figure out if you have improved move ordering, or have made a significant performance change somehow.

bruce
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.