Author: James Swafford
Date: 15:14:41 10/31/01
I'm playing around with a hyperbolic tangent function to predict a reward in [-1, 1] from my raw evaluation score of the principal variation. I've come up with predicted_reward = tanh(pawn_adv/300), where pawn_adv is in centipawns, so an advantage of 1 pawn ---> pawn_adv=100.

The following table shows the relationship between the advantage (in pawns) and predicted_reward:

advantage (pawns)   predicted_reward
  .1                .033
  .25               .083
  .33               .11
  .4                .133
  .5                .165
  .75               .245
 1                  .32
 2                  .58
 3                  .76
 4                  .87
 5                  .93
 6                  .964
 7                  .981
 8                  .990
 9                  .995
10                  .9975
12                  .9993
15                  .9999

So... a 1 pawn advantage yields a predicted reward of .32. Has anybody done research, or does anybody know of research, that can tell me how close those figures are? That is, if your program obtains a 1 pawn advantage, do you know how likely it is to win?

Tridgell and Baxter's paper says they give a one pawn advantage a predicted reward of .25, but it doesn't say why they chose that number. Maybe they pulled it out of thin air, I don't know.

Comments? Does anybody know of a better function than tanh() to do this?

-- James
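For anyone who wants to check the figures, here is a minimal Python sketch of the mapping described above. The function names (predicted_reward, scale_for) are made up for illustration. The second helper only shows which divisor would give a chosen reward at one pawn if the same tanh form is assumed; it is not a claim about how any particular program picked its number.

```python
import math

SCALE = 300.0  # divisor from the post; pawn_adv is in centipawns (1 pawn = 100)


def predicted_reward(pawn_adv_cp, scale=SCALE):
    """Map a centipawn score to a predicted reward in (-1, 1)."""
    return math.tanh(pawn_adv_cp / scale)


def scale_for(target_reward, pawn_adv_cp=100.0):
    """Divisor that makes tanh(pawn_adv_cp / divisor) equal target_reward.

    Hypothetical helper: assumes the tanh form above and 0 < target_reward < 1.
    """
    return pawn_adv_cp / math.atanh(target_reward)


if __name__ == "__main__":
    # Reprint the table: advantage in pawns, reward from tanh(centipawns / 300).
    for pawns in (0.1, 0.25, 0.33, 0.4, 0.5, 0.75, 1, 2, 3, 4, 5,
                  6, 7, 8, 9, 10, 12, 15):
        print(f"{pawns:>5}  {predicted_reward(pawns * 100):.4f}")
```

Under that same assumption, scale_for(0.25) comes out at about 391.5, so dividing by roughly 390 instead of 300 would make a one pawn advantage map to .25.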