Author: Steven Edwards
Date: 18:30:42 12/08/03
On December 08, 2003 at 20:38:03, Russell Reagan wrote:

>On December 08, 2003 at 20:27:33, Steven Edwards wrote:
>
>I like the NATS idea.
>
>>One plan is to categorize position difficulty based on
>>
>>    d = log N        d: difficulty    N: node count to solution
>
>Would the method for counting nodes be standardized? Some count nodes
>differently than others. For instance, one might not count the nodes visited
>during a null-move search and instead count the root node from which that
>null-move search was performed as a single node. Some programs do a great deal
>more work per node. Maybe more than one metric should be used. I think time is a
>better one than nodes. Nodes seem dependent upon the design philosophy of the
>engine writer.

While the concept of a node varies from program to program, it is always the same (one hopes) within any given program. And that's all we need to perform a normalization.

Example: consider four programs, each of which runs the same test suite under tournament conditions (hardware and time). The largest common subset of solved problems is selected from the results, and the total N (the sum of the node counts for the solved positions) is then computed for each program over this subset. This gives four separate and very likely different sums.

Now, for each program, we calculate a separate normalization factor given by the reciprocal of that program's N sum, and then multiply each of that program's individual node counts for the problem subset by this factor. This gives comparable numbers across the programs, so they can then be averaged. These means, one for each position in the subset, can then be fed into the log N difficulty metric.

There are other ways of doing this, but they'll all give about the same rankings. Slightly modified procedures are used for positions solved by only a subset of the programs, including positions solved by no programs at tournament effort levels.
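For anyone who wants to experiment, the normalization described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual NATS procedure: the function name, the nested-dictionary layout, the program and position labels, and the node counts are all made up, and the natural logarithm is assumed because the base of the log is not specified. It covers only positions solved by every program; the modified procedures for partially solved or unsolved positions are not attempted.

```python
import math


def normalized_difficulty(node_counts):
    """Sketch of the normalization; all names here are hypothetical.

    node_counts maps program name -> {position id: nodes to solution},
    listing only the positions that program actually solved.
    """
    # Largest common subset: positions solved by every program.
    common = set.intersection(*(set(r) for r in node_counts.values()))
    if not common:
        raise ValueError("no position was solved by every program")

    # Per-program factor: reciprocal of that program's total node count
    # over the common subset.
    factors = {
        prog: 1.0 / sum(results[pos] for pos in common)
        for prog, results in node_counts.items()
    }

    difficulty = {}
    for pos in sorted(common):
        # Scale each program's count by its own factor, then average.
        scaled = [node_counts[p][pos] * factors[p] for p in node_counts]
        mean = sum(scaled) / len(scaled)
        # Assumption: natural log; the base only shifts the scale.
        difficulty[pos] = math.log(mean)
    return difficulty


# Made-up counts for illustration; program B counts nodes twice as
# generously as program A, and solved one position fewer.
example = {
    "A": {"pos1": 100, "pos2": 300, "pos3": 50},
    "B": {"pos1": 200, "pos2": 600},
}
ranking = normalized_difficulty(example)
```

Because each scaled count is a fraction of that program's total, the means are below 1 and the resulting d values are negative. Only their relative order matters for ranking positions by difficulty.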
>The times would go down as hardware gets faster, but that is part of the point
>of test suites, to determine how computer chess has advanced, and hardware is a
>part of that equation.

And the process can also be used to normalize results from the same program running on different contemporary platforms.

----

I hope to have the starting version (the NATS/2003) finished by the end of the month. It can act as a seed for test suite development during 2004. Of course, we'll need the NACCA membership to help.

----

Back when the first EPD test suites were published, I thought that the ICCA should have come up with its own test suite, formally tested by its members and with versions periodically emailed and posted (and archived). It would have been a big help. I would have done it myself but was busy with other chess topics.

So the NATS, like several other goals of the NACCA, is intended to assist the active CC researcher in a practical manner, in ways beneficial to all from neophyte to old hand.
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.