Author: James Swafford
Date: 15:17:39 10/31/01
On October 31, 2001 at 18:14:41, James Swafford wrote:

>
>I'm playing around with a hyperbolic tangent function in order
>to predict a reward [-1 ... 1] based on my raw evaluation score
>of the principal variation.
>
>I've come up with predicted_reward = tanh(pawn_adv/300), where
>a pawn advantage of 1 pawn ---> pawn_adv=100.
>
>The following table shows the relationship between pawn_adv and
>predicted_reward:
>

Actually, that's pawn_adv/100 vs. predicted_reward...

>pawn_adv   predicted_reward
>.1         .033
>.25        .083
>.33        .11
>.4         .133
>.5         .165
>.75        .245
>1          .32
>2          .58
>3          .76
>4          .87
>5          .93
>6          .964
>7          .981
>8          .990
>9          .995
>10         .9975
>12         .9993
>15         .9999
>
>So... a 1 pawn advantage yields a predicted reward of .32.
>
>Has anybody done research, or know of research, that can tell me
>how close those figures are? i.e. if your program obtains a
>1 pawn advantage, do you know how likely it is to win?
>
>Tridgell and Baxter's paper says they give a one pawn advantage
>a predicted reward of .25, but it doesn't say why they chose
>that number. Maybe they pulled it out of thin air, I don't know.
>
>Comments? Anybody know of a better function than tanh() to
>do this?
>
>--
>James
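
For anyone who wants to regenerate the table, here is a minimal Python sketch of the mapping described in the post. It is not the code from any actual engine: the names predicted_reward, scale_for and SCALE are made up for illustration, and scores are assumed to be in centipawns (1 pawn = 100), as in the post.

```python
import math

# Divisor from the post: predicted_reward = tanh(pawn_adv / 300).
SCALE = 300.0


def predicted_reward(pawn_adv_cp: float, scale: float = SCALE) -> float:
    """Map an evaluation in centipawns to a predicted reward in (-1, 1)."""
    return math.tanh(pawn_adv_cp / scale)


def scale_for(reward_at_one_pawn: float) -> float:
    """Hypothetical helper: the divisor that makes a 1 pawn advantage
    (100 centipawns) map to the given reward, from tanh(100 / s) = r."""
    return 100.0 / math.atanh(reward_at_one_pawn)


if __name__ == "__main__":
    # Advantages in pawns, matching the left column of the table.
    for pawns in (0.1, 0.25, 0.5, 1, 2, 3, 5, 10):
        print(f"{pawns:>5}  {predicted_reward(pawns * 100):.4f}")
```

With the divisor of 300, a 1 pawn advantage maps to about .32, as in the table. Solving tanh(100/s) = .25 for s gives s = 100/atanh(.25), roughly 391.5, so matching the .25 figure quoted from Tridgell and Baxter's paper would mean dividing by about 390 instead of 300.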