Author: Brian Richardson
Date: 15:33:17 10/24/04
I am again looking at adding TD learning to Tinker. Thank you to those who have provided comments and suggestions so far.

My concern is that the position on which the final score is based must be saved, and this seems awkward to do. So I walk down the PV as far as possible and then look at that position. Unfortunately, the PV does not normally extend far enough, due to the quiescence search or other search "instabilities". The good news is that I found several PV and search bugs, and things have improved. The bad news is that for TD learning, the final score sometimes does not match the evaluation score for the position reached by walking the PV.

I then tried matching the final score against a qsearch score from that walked-PV position. This almost always matches, but not _all_ of the time. For example, running Tinker on Fine #16 (8/8/7k/8/4p1K1/8/5P2/8 b - -, bm e3), nothing matches after 12 ply, then things stabilize and match again for a while, then there are more mismatches, and so on. I have tried testing with and without hashing, with and without pawn hashing, force-stuffing the PV into the hash table after each iteration, and some other basic things, but there still seem to be a few cases where the scores do not match.

My question is for those who have already added TD learning to their programs: was this a problem for you, or do your engines perhaps have a "cleaner" PV? I could just run with qsearch instead of eval, but of course that would add quite a bit of time to the learning runs.

Thanks,
Brian
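For readers following along, here is a minimal sketch of the consistency check being described: after a search, walk the stored PV from the root and compare the backed-up search score with the static eval (and, as a fallback, a quiescence score) of the end-of-PV position. This is not Tinker's actual code; the types and functions used here (Position, Move, make_move(), evaluate(), qsearch(), and the pv[]/pv_length globals) are hypothetical stand-ins for whatever a given engine provides.

    /* pv_leaf_check.c -- hedged sketch, assuming an engine-supplied interface */
    #include <stdio.h>

    typedef struct Position Position;   /* engine board state, opaque here     */
    typedef int Move;                   /* engine-specific move encoding       */

    extern Move pv[64];                 /* PV stored by the last search        */
    extern int  pv_length;              /* number of moves in pv[]             */

    extern void make_move(Position *pos, Move m);            /* assumed call  */
    extern int  evaluate(const Position *pos);                /* static eval   */
    extern int  qsearch(Position *pos, int alpha, int beta);  /* quiescence    */

    /*
     * Walk the PV on a scratch copy of the root position and report whether
     * the leaf scores agree with the score backed up to the root.  evaluate()
     * and qsearch() are assumed to score from the side to move, so the sign
     * is flipped once per ply to get back to the root's point of view.
     */
    int check_pv_leaf(Position *pos, int search_score)
    {
        int sign = 1;

        for (int ply = 0; ply < pv_length; ply++) {
            make_move(pos, pv[ply]);
            sign = -sign;
        }

        int leaf_eval = sign * evaluate(pos);
        int leaf_qs   = sign * qsearch(pos, -32000, 32000); /* full window */

        if (leaf_eval != search_score && leaf_qs != search_score)
            printf("PV leaf mismatch: search=%d eval=%d qsearch=%d\n",
                   search_score, leaf_eval, leaf_qs);

        return leaf_qs == search_score;
    }

Even with a check like this, hash-table cutoffs, extensions, or pruning along the PV can make the re-walked leaf score differ from the backed-up root score, which seems consistent with the occasional mismatches described above.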