Computer Chess Club Archives


Subject: Re: Hello from Edmonton (and on Temporal Differences)

Author: Robert Hyatt

Date: 20:58:05 08/04/02



On August 04, 2002 at 09:04:39, Vincent Diepeveen wrote:

>On July 31, 2002 at 21:35:32, James Swafford wrote:
>
>>On July 31, 2002 at 18:10:08, Vincent Diepeveen wrote:
>>
>>>On July 30, 2002 at 22:43:36, James Swafford wrote:
>>>
>>>>
>>>>Hey everyone.  I'm at an AAAI conference in Edmonton.  It's ironic (to me)
>>>>that it's been mentioned here recently that Edmonton is a hive of computer
>>>>chess enthusiasts.  I don't know if that's true (what's a "hive"? :-), but
>>>>there are certainly a few...
>>>>
>>>>Now to my question.  I asked Jonathan Schaeffer today (who is a really
>>>>nice guy, IMO) some questions about his experience with TD learning
>>>>algorithms.  He's (co?)published a paper entitled (something like)
>>>>"Temporal Difference Learning in High Performance Game Playing."  I
>>>>thought the title was a bit misleading, because he focused on checkers.
>>>>Checkers programs have much smaller evaluation functions than chess
>>>>programs, obviously.  I asked him if he thought the TDLeaf(Lambda)
>>>>algorithm had potential in high calibre chess.  (Yes, yes, I know
>>>>all about Knightcap... but that wasn't quite "high" calibre.)
>>>>He responded with a very enthusiastic "yes".  He said "I'll never manually
>>>>tune another evaluation function again."
>>>
>>>And he'll never do a competitive chess program again either; he forgot to
>>>add that too.
>>>
>>>>A natural follow up question (which I also asked) is -- then why isn't
>>>>everyone doing it??  I don't _believe_ (and maybe I'm wrong about this)
>>>>that any top ranked chess programs use it.  His response was simply:
>>>>"There's a separation between academia and industry."  Schaeffer stated
>>>
>>>Schaeffer is well known for his good speeches and answers :)
>>>
>>>>that perhaps the programmers of top chess programs don't believe in
>>>>the potential of temporal difference algorithms in the chess domain.
>>>>Or, perhaps, they don't want to put the effort into them.
>>>
>>>>I believe Crafty is the strongest program in academia now.  If not,
>>>>certainly among the strongest.  So, Bob -- have you looked at TDLeaf
>>>>and found it wanting?  It's interesting (and perplexing) to me that
>>>>paper after paper praises the potential of TDLeaf, but it's _yet_ to
>>>>be used in the high end programs.  Knightcap was strong, but it's
>>>>definitely not in the top tier.
>>>
>>>I remember Knightcap very well. TD learning had the habit of slowly
>>>making it more aggressive until it was giving away a piece for one pawn
>>>and a check.
>>>
>>>Then of course the 'brain was cleared' and the experiment restarted.
>>>So, in short, the longer the program used TD learning, the worse it
>>>would play, from my viewpoint.
>>>
>>>It definitely did from a chessplayer's viewpoint. Of course we must not
>>>forget that at the time it played online, nearly no program was very
>>>aggressive. So playing a few patzer moves was a good way to go from
>>>scoring perhaps 11% to 12% or so.
>>>
>>>>Maybe Tridgell/Baxter quit too soon, and Knightcap really could've been
>>>>a top tier program.  Or maybe the reason nobody is using TD is because
>>>>it's impractical for the large number of parameters required to be
>>>>competitive in chess.  Or maybe Schaeffer was right, and the commercial
>>>>guys just aren't taking TD seriously.
>>>>
>>>>Thoughts?
>>>>
>>>>--
>>>>James
>>
>>
>>So, I can put you on record as saying that TD-Leaf is never going to
>>produce a high calibre player?
>
>For a complex evaluation, TD learning will never achieve what hand tuning
>by an experienced chess programmer achieves. That is a statement I'm
>willing to make.
>
>Of course, if you start with the most stupidly tuned set, like putting
>everything to zero or everything to -1, then it looks as if TD learning
>and all other random forms of learning are OK.
>
>The same goes for neural networks and such. I toyed quite a bit with
>simple neural networks, simply because there are several out there to
>toy with.
>
>The major problem is that if I, for example, conclude that open files are
>more important than a pawn in the center, *any* form of general learning
>will never, by definition, be able to conclude the same, for the obvious
>reason that it has no domain knowledge.
>
>We can discuss this until chess is solved, but it's definitely a really
>simple case here. The proof that it doesn't work is so obvious that I am
>always amazed by people who say it works for them.
>
>Those must be people who don't know the difference between a bishop and
>a knight ;)
>
>What I advise is to tune Crafty against an opponent Crafty currently
>scores 80% against. Tuning something in order to achieve < 50% is really
>simple, because there is no proof that it could be done better.
>
>You really see the difference between automatic tuning and hand tuning
>when an engine is crushing a certain opponent with the hand tuning.
>
>Now automatically tune it to get more than that: 90% instead of 80%.
>
>If you have an incredibly bad engine and you modify a random thing in the
>search, it might still play incredibly badly, but a bit better.
>
>For the stronger engines, however, this is way harder.
>
>So turn off learning in Crafty, find an opponent it scores well against,
>then autotune Crafty. It has a very small evaluation, and the few patterns
>it has don't even require arrays to tune, so there are very few parameters
>to tune. Should be easy, no?

No arrays?  Have you looked at the code?  It has many arrays, some of which
are used in a third-order fashion: lookups and summations from a first array,
then that value is used to index into a second array...  then some sums of
those, and that value indexes into a third array...
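
To make "third-order" concrete, here is a rough sketch in C of that kind of
evaluation term.  This is not Crafty's actual code (the table names, sizes,
and scaling are made up for illustration), but it shows why "no arrays to
tune" is wrong: every entry of every table is a tunable parameter.

  /* Hypothetical third-order evaluation term: values summed from a first
     table produce an index into a second table, and a combination of those
     indexes a third.  Tables are zero-filled placeholders here.          */
  #define NSQ 64

  static const int first[NSQ] = { 0 };   /* per-square scores             */
  static const int second[48] = { 0 };   /* indexed by a sum of the above */
  static const int third[32]  = { 0 };   /* indexed by a combined value   */

  int eval_term(const int squares[], int n) {
      int sum1 = 0;
      for (int i = 0; i < n; i++)         /* first-order lookups           */
          sum1 += first[squares[i]];

      int idx2 = sum1 / 4;                /* the sum indexes a second table */
      if (idx2 < 0)  idx2 = 0;
      if (idx2 > 47) idx2 = 47;
      int v2 = second[idx2];

      int idx3 = (v2 + sum1) / 8;         /* ...which indexes a third       */
      if (idx3 < 0)  idx3 = 0;
      if (idx3 > 31) idx3 = 31;
      return third[idx3];
  }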

I think TD learning would be tough.  But I don't see why it can't work.  Just
because it might be hard to do doesn't mean it is impossible to do...
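
For reference, the core TDLeaf(lambda) weight update from the Baxter/Tridgell/
Weaver KnightCap papers is itself quite small.  The sketch below is my own
simplification in C (hypothetical names, a linear evaluation of the
principal-variation leaf, scores already squashed to [-1,1], gradients
computed elsewhere); the hard part in a real engine is producing the gradients
and the training games, not this loop.

  #define NW 16    /* number of tunable evaluation weights (placeholder) */

  /* w[k]       : current weights
     grad[t][k] : d(PV-leaf eval at move t) / d(w[k]),  t = 0 .. T-1
     score[t]   : squashed PV-leaf eval at move t,      t = 0 .. T-1     */
  void tdleaf_update(double w[NW], double grad[][NW],
                     const double *score, int T,
                     double alpha, double lambda)
  {
      for (int t = 0; t < T - 1; t++) {
          /* discounted sum of future temporal differences
             d_j = score[j+1] - score[j]                                 */
          double dsum = 0.0, decay = 1.0;
          for (int j = t; j < T - 1; j++) {
              dsum  += decay * (score[j + 1] - score[j]);
              decay *= lambda;
          }
          /* push each weight along the gradient of the leaf eval that
             made the prediction at move t                               */
          for (int k = 0; k < NW; k++)
              w[k] += alpha * grad[t][k] * dsum;
      }
  }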



>
>Just prove it. We only know of a program which was bad and which played
>even worse after tuning. And even that was a good achievement compared to
>the results I had with autotuning.
>
>It managed to give a penalty for passers!!!
>
>>--
>>James


