Computer Chess Club Archives


Subject: Re: Hello from Edmonton (and on Temporal Differences)

Author: Rémi Coulom

Date: 01:08:09 07/31/02


I have just completed a PhD thesis on temporal difference learning (applied to
motor control, not games), and I also believe that this technique has not yet
been used to its full potential in computer chess.

Knightcap/TDChess was an interesting experiment, but the strength of their
program was not high enough. I talked to some authors of stronger chess programs
that tried reinforcement learning. They told me it did not work well for them
(Franck Zibi, Pascal Tang, and maybe also Sylvain Renard, if I recall
correctly). I also remember Christophe Théron saying he does not believe it
could help to improve his program. So, this is not very encouraging.

Nevertheless, I still believe that reinforcement learning can be applied
efficiently to computer chess. The key issue is that it requires more effort
than just implementing the simple algorithms that Baxter et al. describe, playing
a few hundred training games, and observing the result. Applying reinforcement
learning efficiently requires a deep understanding of theory, creativity in
selecting the right algorithm, a well-adapted evaluation-function architecture,
and _lots_ of training data. Tesauro's backgammon player took months of CPU time
to learn to play. I have been running motor-control experiments for months of
CPU time as well, and my learners are still making new interesting discoveries.
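
For readers who have not seen one, a simple temporal-difference update of the
kind mentioned above can be sketched roughly as follows. This is only an
illustration of plain TD(lambda) applied to a linear evaluation squashed by
tanh. It is not the TDLeaf variant used in KnightCap, and it is not code from
any program named in this post. The function name, the feature vectors, and the
default values of alpha and lam are all assumptions made for the example.

```python
import math


def td_lambda_update(weights, positions, outcome, alpha=0.01, lam=0.7):
    """Return new weights after one TD(lambda) pass over a finished game.

    weights   -- list of floats, one per evaluation feature (hypothetical)
    positions -- list of feature vectors, one per position of the game,
                 all seen from the same side's point of view
    outcome   -- final result in [-1, 1] (for example 1 = win, -1 = loss)
    alpha     -- learning rate (assumed value)
    lam       -- lambda, how far credit is propagated back (assumed value)
    """
    def value(features):
        # Linear evaluation squashed into (-1, 1).
        return math.tanh(sum(w * f for w, f in zip(weights, features)))

    # All values are computed with the weights as they were before the game.
    values = [value(p) for p in positions]

    # Temporal differences d_t = V(t+1) - V(t); the last target is the result.
    targets = values[1:] + [outcome]
    deltas = [target - v for target, v in zip(targets, values)]

    new_weights = list(weights)
    for t, features in enumerate(positions):
        # Lambda-discounted sum of this and all later temporal differences.
        credit = sum(lam ** (j - t) * deltas[j] for j in range(t, len(deltas)))
        slope = 1.0 - values[t] ** 2  # derivative of tanh at position t
        for i, f in enumerate(features):
            new_weights[i] += alpha * slope * f * credit
    return new_weights
```

Calling td_lambda_update once per training game, and feeding the returned
weights into the next call, gives the basic training loop. The hard parts
listed above (choice of algorithm, evaluation architecture, amount of training
data) are exactly what this sketch leaves out.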

Also note that all book learning algorithms are reinforcement learning
algorithms, whether their authors know it or not. So, reinforcement learning has
already been applied successfully to high-level chess programs!

Trying reinforcement learning in The Crazy Bishop is the first item on my list
of ideas to try. It is not likely to happen very soon, though. For some years
now, other activities have had higher priority for me.

If you are curious, you can take a look at my thesis:
http://remi.coulom.free.fr/Thesis/
This web page contains interactive demos of swimmers that learn to swim, and a
car driver that learns to drive.

Rémi


