Computer Chess Club Archives


Subject: Re: learning evaluation weights (was Re: Genetic algorithms for chess?)

Author: Don Beal

Date: 11:22:53 05/23/98


On May 23, 1998 at 04:38:19, Komputer Korner wrote:
>[snipped]

Thanks for your interest.

You ask "why not do a 1-ply search?".  The answer is that
we wanted to perform the experiment with the deepest possible
lookahead search that still allowed us to run tens of thousands of
games on a desktop computer in a few weeks.

The deeper the lookahead search, the better the quality of play
given any particular set of piece values.  The better the quality
of play, the fewer games are necessary, and the closer the learned
values are likely to be to those chosen for competitive play.

The goal of our experiment was not to re-invent piece values in
order to find better ones for a competitive program, but to prove
that the method worked, and to satisfy ourselves that it was
robust over a wide range of parameter settings (a wider range than
we reported in the paper).

Within that goal we chose the deepest searches we were willing to
wait for.  I did not say the results were totally independent of
depth.  But beyond depth 3, the values obtained change only slightly
as the search depth increases, while the CPU time per game goes up
exponentially (of course).
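For a rough sense of the cost (my numbers here, not from the
experiment): if alpha-beta search has an effective branching factor
of about b, a d-ply search visits on the order of b^d nodes, so each
extra ply multiplies the time per game by roughly b.

    # Back-of-envelope only; b = 6 is a hypothetical effective
    # branching factor after alpha-beta pruning.
    b = 6
    for d in range(1, 7):
        print(f"depth {d}: ~{b ** d} nodes")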

It is not easy to predict the results of such experiments.  One
non-obvious effect is that the TD learning process can in principle
(and other things being equal) learn values that can only be
deliberately exploited by searches deeper than those used in the
self-play!

This effect could arise because even if the play is poor (due to
low search depth), a better position could still offer a greater
probability that the right play will be found by accident.  The
TD learning process works by relating *positions* to outcomes
and does not require that the search sees the win.  Over a large
enough number of trials, the learning could respond to a
statistically significant number of "accidental" wins from
superior positions.
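To make that mechanism concrete, here is a minimal TD(lambda) sketch
in Python.  It is an illustration only, not the code from the paper:
the tanh squashing, the learning rate, and the feature encoding are
all assumptions of mine.

    import math

    def predict(w, x):
        # Squashed material balance in [-1, 1]; w = piece values,
        # x = signed piece-count differences for one position.
        return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))

    def td_lambda_update(w, positions, outcome, alpha=0.01, lam=0.9):
        # positions: feature vectors for the positions of one
        # self-play game; outcome: +1 win, 0 draw, -1 loss for the
        # learner.  The final outcome acts as the last "prediction",
        # so weights are pulled toward positions that eventually
        # won -- whether or not any search actually saw the win.
        preds = [predict(w, x) for x in positions] + [float(outcome)]
        for t, x in enumerate(positions):
            # Discounted sum of future temporal differences.
            delta = sum((lam ** (k - t)) * (preds[k + 1] - preds[k])
                        for k in range(t, len(positions)))
            grad = 1.0 - preds[t] ** 2      # derivative of tanh
            for i, xi in enumerate(x):
                w[i] += alpha * delta * grad * xi
        return w

Note that the update never asks whether the search found a win: it
only relates positions to the eventual outcome, which is exactly why
"accidental" wins from superior positions can still move the weights.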

Hence it is quite possible the "low" knight value has more to do
with the lack of other evaluation terms than with the lack of search
depth.  Or even that the value "ought" to be low.  I don't have
enough information to guess the answer.  Your guesses might be right.

The only way to really know is to do more experiments.

Don Beal.

PS. To answer your other question: the program used in the
matches was the same as that used in the learning runs: four ply
plus quiescence, with randomised choice from tactically-equal
moves.
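For the curious, that randomisation might look something like the
sketch below -- my assumption of how it could be done, not the
actual program; search_score is a hypothetical stand-in for the
fixed-depth search plus quiescence.

    import random

    def choose_move(root_moves, search_score):
        # Score every root move with the fixed-depth search plus
        # quiescence, then pick uniformly at random among the moves
        # sharing the best score ("tactically equal" moves).
        scored = [(search_score(m), m) for m in root_moves]
        best = max(s for s, _ in scored)
        return random.choice([m for s, m in scored if s == best])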


