Author: Richard A. Fowell (fowell@netcom.com)
Date: 22:22:13 02/09/99
On February 09, 1999 at 16:30:03, Dann Corbit wrote:

>There are some difficulties with EPD I would like to discuss with the experts.
>1. There seems to be no way of judging what the originator of an EPD record
>intended (e.g. win major piece, gain positional advantage, gain temporal
>advantage, achieve draw, achieve checkmate). It would be nice to have an
>additional field something like eo "expected outcome" or the like to quantify
>what is "intended" by the best move.

For the checkmate, at least, you can use the "dm" code - e.g., "dm 2;" denotes
a "mate in 2" problem.

As a thought, one could propose a "nag" opcode, which would let one express the
positional situation using the NAG evaluations already present in the standard.

>2. There is no code in the PGN standard for depth in plies. This seems to be
>far more valuable than nodes searched, since some programs like Hiarcs don't
>search a lot of nodes but achieve great depth none the less. Crafty uses acd
>and Hiarcs uses dep but there does not seem to be any reason to pick one over
>the other since the standard does not specify.

Well, the standard does say where to send suggestions for new opcodes. The
Crafty choice seems more logical to me, since it is in the same class of
opcodes as "acs" and "acn". Of course, "acp" would fit that, too.

There is the added problem that different programs reach different plies in
the same amount of time. For example, MacChess will search to much deeper
plies than HIARCS in the same time, yet HIARCS is far stronger. I think that
the key normalization we need here is a "rating" ... see below.

>3. The code for seconds (acs) seems pretty valueless unless we have some field
>to describe the hardware/software (SetupID or something like that).
>Some of the machines in C.A.P. are orders of magnitude more powerful
>than others.

For speed alone, when one has "acn" as well as "acs", one has some information
about machine speed. However, an issue remains: I have software that ranges
from 50,000 nps to 400,000 nps on a G3/400 (and the 50,000 nps beast is
strongest).

My modest proposal here would be to have an "acr" field, for "analyst computer
rating", where one would fill in the estimated rating of the computer (or
human), keyed to the SSDF list. One could adjust for the speed of the platform
used by ratioing the throughput of the test platform to the throughput of the
platform used for the analysis, and applying some factor (e.g., the 50-70
points per speed doubling often quoted). A rough sketch of this adjustment
appears after the list below.

Obviously, this has a number of issues:
- some folks will object to using the SSDF list, on various principles.
- the rating of a given program on the SSDF list fluctuates with time
  (although I suspect their quoted uncertainty range is relatively good).
- many programs aren't SSDF rated.
- the "rating" of a computer for a given position or type will vary wildly.
  For a 3-5 piece ending, Crafty with full tablebases is "acr 9999". For a
  mate result, a matefinder routine should also be "acr 9999", whereas for
  most of the "top six", when they report mate in 10, there may very well be
  doubt. [Mind you, this lack is easily fixed - a program could report its
  rating adjusted for the position type, using "acr 9999" for results which
  are certain (well, in the programmer's mind), the regular rating in nominal
  positions, and a reduced value for positions that the program knows it is
  weak at.]
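To make the speed adjustment concrete, here is a rough sketch in Python (the
function name and the 60-point figure are only illustrative - 60 is simply a
middle value of the oft-quoted 50-70 range, and none of this is part of the
standard):

    import math

    def adjusted_acr(ssdf_rating, analysis_nps, reference_nps,
                     points_per_doubling=60):
        # Scale a rating-list value by the throughput ratio between the
        # analysis platform and the platform the rating was measured on,
        # crediting points_per_doubling Elo per doubling of speed.
        speedup = analysis_nps / reference_nps
        return round(ssdf_rating + points_per_doubling * math.log2(speedup))

    # e.g., a program rated 2450 on the rating-list hardware, run on a
    # machine with twice the throughput:
    # adjusted_acr(2450, 400000, 200000) -> 2510

A platform slower than the reference gives a speedup below 1 and hence a
negative adjustment, which is the behavior one would want.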
However, it seems to me that, even given the difficulties listed above, "acr"
combined with "acs" would give you a better generic idea of the quality of an
analysis than "acd" can.

>4. There is some question in my mind as to how many best moves there can be. I
>found out (much to my chagrin) there are sometimes more than one. (A few entries
>in rockpile are there because the bm entry is simply truncated..., actually they
>have an easily found solution -- I just have not separated them out yet.) It
>seems to me that sm and pm could fill roles as secondary and tertiary options,
>perhaps ranked "bm", "pm", "sm", (but be careful not to:) "am".
>
>Comments?

Note that the "bm" (and "am") opcodes will take multiple moves - this is in
the spec, I've seen it used, and some of my programs understand this.

>
>Additional suggestions?

I think "acr" is what you are looking for. For "C.A.P." use, you can use "Acr"
(per the convention spelled out in the standard for unofficial opcodes). You
have already calibrated your testers, so you can add the "Acr" tag to their
entries when the results come in ... you can even use the "Acr 9999" tags with
a little more processing (that is, understand which egtbs the tester has, and
check the positions he returns to see if absolute knowledge is likely).

If "acr" enters the standard, you can batch convert your "Acr" tags to "acr",
or to whatever equivalent opcode is decided upon. If the rating basis used is
different, you can postprocess that, too. At one point, you said you had all
the raw returns, so you can go back and annotate them now.

Regarding getting this into the standard, you can start by sending it to
Steven J. Edwards.

Richard A. Fowell (fowell@netcom.com)
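P.S. To make the multiple-move "bm" form and the experimental "Acr" opcode
concrete, here is what such a record might look like. The position is a
trivial made-up queen ending, invented purely for illustration (both Qb8# and
Qg7# deliver mate in one, so both belong in "bm"):

    6k1/8/6K1/8/8/8/1Q6/8 w - - bm Qb8# Qg7#; dm 1; Acr 9999;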