Author: Dagh Nielsen
Date: 02:02:53 01/03/06
On January 03, 2006 at 00:37:25, Joseph Ciarrochi wrote:

>>It depends on the stage or phase of the game: a static evaluation of less than +1.0 by a computer engine is less reliable in the opening stage than in the endgame stage. Thus, in practical chess terms, an evaluation of less than +1.0 is not a severe disadvantage, because there is often dynamic compensation for Black. Dynamic compensations are not static values; their worth can change completely as the position unfolds into a more static one. Hence, a static evaluation of +1.0 is a great advantage when the position is void of dynamics or approaches a more static phase of the game, for example the transition to an endgame. Even in an endgame, an advantage of +1.0 for either side is not very much in a rook ending, whereas in a minor-piece ending (Knight vs. Bishop) the same +1.0 can be enough to convert into a bigger advantage. My recommendation is to stop evaluating the position from a STATIC point of view and start evaluating it more from a DYNAMIC point of view, which is what IM Andrew Martin is recommending.
>>
>>My 2 cents,
>>
>>Laurence
>
>This is a brilliant post, Laurence! Thanks for that.
>
>It would be great if the computers could capture the uncertainty of the evaluation (e.g., a particular rook ending might be +1 with a plus/minus .8 confidence interval, whereas a knight ending might often be +1 with a plus/minus .2 confidence interval). Is this possible?

In my understanding, a good evaluation function already integrates "confidence". This is very important when the engine has to decide which endings to enter. Opposite-colored bishop endings are drawish, so being a pawn up there should not give the same score as +1 pawn in a knight ending. Queen endings are drawish too, etc.
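One simple way an evaluation function can fold this kind of "confidence" into a single score is to damp the material advantage by how drawish the ending type is. The sketch below is purely illustrative (the factor values and names are made up, not taken from any real engine):

```python
# Illustrative sketch: scale a raw material evaluation (in centipawns) by a
# "drawishness" factor for the ending type, so that +1 pawn in an
# opposite-colored-bishop ending scores lower than +1 pawn in a knight
# ending. All factors below are made-up placeholders, not tuned values.

DRAWISHNESS = {
    "opposite_bishops": 0.4,   # notoriously drawish
    "rook":             0.7,   # rook endings are often drawish a pawn down
    "queen":            0.7,   # perpetual-check chances
    "knight":           0.95,  # an extra pawn usually converts
}

def scaled_eval(material_cp, ending_type):
    """Damp the raw material score toward 0 in drawish endings."""
    return material_cp * DRAWISHNESS.get(ending_type, 1.0)
```

With such scaling, the search naturally prefers to steer a one-pawn advantage into a knight ending rather than an opposite-colored-bishop ending.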
In a deeper sense, an evaluation may be better understood as a prediction of the result than as a counter of n pawns' advantage. The task is then to calibrate the evaluation, depending e.g. on the piece set and extra-material factors, so that the engine sorts its predictions well.

There is a good reason to build an evaluation function around the value of a pawn, namely that all the experience gathered over hundreds of years about relative piece values can then be added straightforwardly. But I wonder if there is any merit to the idea of letting the evaluation give a predicted percentage score directly instead (that is, a position is evaluated at 63% to indicate some white advantage). One main idea is to get into the habit of THINKING this way while tuning the evaluation function, maybe by collecting stats and processing them in relation to feature bonuses (the point being that a [-320, 320] interval can of course be mapped more or less straightforwardly onto a 0%-100% interval anyway).

But also, once you do it this way, you get some kind of absolute point of reference. Say you have worked a lot on rook ending evaluation. You take a large set of random rook endings with White a pawn up, average your static evaluation of these positions, and end up with 80%. BUT when played out by top engines, the score is only 63%. THEN your evaluation is probably misleading; it may be good for sorting predictions within +1 pawn rook endings, but once your engine has to compare evaluations of rook endings against, for instance, knight endings, you would probably have a problem. But at least you KNOW you have a problem :-) If the evaluation is based on the usual pawn unit instead, and you have no standard conversion to percentage predictions, there really is no way to decide that an average of 0.93 is too high and should somehow be calibrated down to 0.72 in order for the engine to compare rook ending advantages with knight ending advantages appropriately.

But back to confidence...
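As a brief aside, the calibration check described above can be made concrete with a logistic mapping from a centipawn evaluation to an expected score. This is only a sketch; the scale constant k is an arbitrary illustrative choice, not a tuned engine parameter:

```python
# Sketch: map a centipawn-style evaluation onto an expected score in [0, 1].
# The logistic scale k = 400 is an arbitrary illustrative choice (it makes
# +100 cp predict roughly 64%), not a constant from any real engine.

def expected_score(cp, k=400.0):
    """0 cp -> 50%; large positive cp -> close to 100%."""
    return 1.0 / (1.0 + 10.0 ** (-cp / k))

def calibration_gap(static_cps, actual_score, k=400.0):
    """Average predicted score of a set of positions, minus the score those
    positions actually produced when played out (e.g. by top engines)."""
    predicted = sum(expected_score(cp, k) for cp in static_cps) / len(static_cps)
    return predicted - actual_score
```

In the rook-ending example above, a persistently positive gap (predicting 80% where engines actually score 63%) is exactly the signal that the rook-ending evaluation should be calibrated down.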
While a static evaluation can be given an easy interpretation as a predicted score, the issue gets a lot messier when dealing with a root position evaluation derived by min-maxing leaf node evaluations. A typical example: would you rather play a move evaluated at +1.00 at depth 4 than another move evaluated at +0.80 at depth 12? The answer is simple, but the consequences for engine implementation are not :-) The main reasons: 1) Most moves in the tree are not precisely evaluated, but only given bounds on their evaluation ("worse than", from alpha-beta cutoffs). 2) Engines use iterative deepening anyway and have a current global reference depth.

The big task seems to be to find a way to interpret the profile of an alpha-beta generated tree in smart ways that let you make "confident" adjustments to evaluations and confident decisions about the value of different search directions. One interesting question to pose is: given a tree of analysis with leaf node evaluations, which node would you expand in order to "improve" the tree of analysis the most, if given a free choice, but only one node to expand? And, as a follow-up question, exactly how would you measure the success of such a decision? Is it the greatest improved confidence that your tree of analysis will decide on the "ultimately right" move one step ahead? (Consider: best confidence is not the same as most profitable; the possibly fatal consequences of different decisions are not measured and integrated in the question.) What if you can expand 100 nodes, one at a time; will the best one-step strategy lead to the best "after 100 steps" strategy? Not necessarily, but how often?

Sigh, all this is highly dynamic and chaotic and holistic and... In the end, each move has only three possible game-theoretically correct evaluations (win, draw, or loss), and the rest is just qualified speculation. Some strategies work well, others don't. Conspiracy search is one attempt to tackle the above considerations.
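For reference, the conspiracy-number idea mentioned above can be sketched in a few lines: the "confidence" in a backed-up min-max value is measured by how many leaf evaluations would have to change ("conspire") to push the root value past a threshold. This is a toy illustration of conspiracy numbers (McAllester), not an engine implementation; the `Node` structure and function names are mine:

```python
from dataclasses import dataclass
from typing import List, Optional

# Toy illustration of conspiracy numbers: how many leaf evaluations must
# change ("conspire") before the root's min-max value can cross a threshold.
# A high conspiracy number means high confidence in the current value.

@dataclass
class Node:
    is_max: bool = False
    value: float = 0.0                     # only meaningful at leaves
    children: Optional[List["Node"]] = None

def cn_raise(node, t):
    """Min number of leaves that must increase to make node's value >= t."""
    if node.children is None:              # leaf: it either conspires or not
        return 0 if node.value >= t else 1
    if node.is_max:                        # one rising child suffices
        return min(cn_raise(c, t) for c in node.children)
    return sum(cn_raise(c, t) for c in node.children)   # all low children must rise

def cn_lower(node, t):
    """Min number of leaves that must decrease to make node's value <= t."""
    if node.children is None:
        return 0 if node.value <= t else 1
    if node.is_max:                        # all high children must drop
        return sum(cn_lower(c, t) for c in node.children)
    return min(cn_lower(c, t) for c in node.children)
```

A natural node to expand is then one whose re-evaluation would most cheaply reduce the conspiracy number of an alternative root value, i.e. one that could most easily change the decision at the root.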
I would be interested to hear what Vasik Rajlich has to add about that :P

Regards,
Dagh Nielsen

>I have another question: why are computers more prone to error with dynamic positions? Rybka at 19 ply still sees +0.8 for White. If Rybka searches this deep, doesn't it see how the position changes?
>
>Still a little confused :(
>
>best
>Joseph
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.