Computer Chess Club Archives


Subject: Re: New version of EXchess

Author: Dan Homan

Date: 05:41:09 11/19/00


On November 19, 2000 at 03:05:19, pavel wrote:

>On November 19, 2000 at 02:49:49, Aaron Tay wrote:
>
>>On November 19, 2000 at 01:47:13, pavel wrote:
>>
>>>On November 18, 2000 at 23:40:08, Dan Homan wrote:
>>>
>>>>On November 18, 2000 at 23:23:27, Brian Richardson wrote:
>>>>
>>>>>On November 18, 2000 at 23:06:36, Dan Homan wrote:
>>>>>
>>>>>>I've just put a new version of EXchess up on my website:
>>>>>>
>>>>>>http://pc.astro.brandeis.edu/BRAG/people/dch/chess.html
>>>>>>
>>>>>>The new version (v4.01) adds Temporal Difference evaluation learning to the
>>>>>>previous version (v3.14).  I am not sure that this really increases the strength
>>>>>>of the program, but it was fun to work on.  There are a couple of other minor
>>>>>>enhancements to the search and opening book code.
>>>>>>
>>>>>> - Dan
>>>>>
>>>>>About how many games with TD learning have been played and did it change your
>>>>>evaluation function much?
>>>>
>>>>
>>>>I've played hundreds of games, but I've also reset the learned values back to
>>>>the original parameters many times as well.  For the parameters which come with
>>>>the released version, I am not sure how many games contribute.  Another wrinkle
>>>>is that the program only 'learns' after a loss, so the number of 'learning
>>>>games' is smaller than the number of games played.
>>>>
>>>>One consistent result is that TD learning wants a smaller value for passed
>>>>pawns than I was using before (about 75% of my original 'hand-tuned' value).
>>>>Also my knight-outpost and bishop-outpost values are consistently increased
>>>>by the TD learning by a factor of 3 or 4.
>>>>
>>>> - Dan
>>
>>
>>
>>
>>>can you elaborate on TD learning?
>>>as far as I know, it fixes values after each game.
>>>
>>>is there any file generated by the program, like a .lrn file, which grows after
>>>each game?
>>>or is the eval tuned externally?
>>>
>>>
>>>pavs
>>
>>There is a score.par file that changes after each loss. But I don't see any way
>>to combine learning from other sources, much like you can import learning from
>>other sources for crafty's book.
>>
>>Is there a way? Otherwise if each new version of EXchess came with a new
>>score.par file , does that mean the learning each user has will be tossed out?
>>

There is no way to do this yet.  I'll try to think if there is a good way to
integrate user-learned values in future versions.


>>I'm also curious about how EXchess decides what to tune after each loss. How
>>does it "know" what evaluation scores to change?
>>

Ah, this is the "Temporal Difference" learning stuff.  It is all explained in
the Knightcap ICCJ articles.  A simple explanation is that a small change is
made in each parameter.  This change is tested to see how significant a change
it would have made in the final score for each position, and that significance
is scaled by how close that position is to the end of the game.  The
significance of the change is then used to judge which parameters should be
adjusted and by how much.
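
The description above can be sketched roughly as follows. This is a minimal
illustration of a TD(lambda)-style update and not EXchess's actual code: the
function names, the tanh squashing, the linear evaluation, the learning rate
`alpha`, and the decay `lam` are all assumptions made for the example.

```python
import math

def evaluate(weights, features):
    """Toy linear evaluation squashed to (-1, 1); positive favors the learner."""
    raw = sum(w * f for w, f in zip(weights, features))
    return math.tanh(raw)

def numeric_gradient(weights, features, eps=1e-4):
    """How much the score moves when each parameter gets a small bump."""
    base = evaluate(weights, features)
    grad = []
    for i in range(len(weights)):
        bumped = list(weights)
        bumped[i] += eps
        grad.append((evaluate(bumped, features) - base) / eps)
    return grad

def td_lambda_update(weights, game, result, alpha=0.1, lam=0.7):
    """Return adjusted weights after one finished game.

    game:   feature vectors, one per position, in the order played
    result: outcome from the learner's side (+1 win, 0 draw, -1 loss)
    """
    # Score of every position, with the real result appended as the last "score".
    scores = [evaluate(weights, f) for f in game] + [result]
    # Temporal differences between consecutive scores.
    diffs = [scores[t + 1] - scores[t] for t in range(len(game))]

    new_weights = list(weights)
    for t, features in enumerate(game):
        # Later differences are discounted by lam per move, so positions
        # near the end of the game feel the final result most strongly.
        credit = sum(lam ** (j - t) * diffs[j] for j in range(t, len(game)))
        grad = numeric_gradient(weights, features)
        for i, g in enumerate(grad):
            new_weights[i] += alpha * g * credit
    return new_weights

# Hypothetical lost game: feature 0 is on in every position, feature 1 only at the end.
weights = [1.0, 0.5]
lost_game = [[1, 0], [1, 0], [1, 1]]
updated = td_lambda_update(weights, lost_game, result=-1)
```

In this toy run the evaluation was optimistic all the way to a loss, so both
weights are pushed down, and a parameter that never affects the score in any
position is left unchanged.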

>>I will run 100 blitz games first versus various strong opponents (to maximise
>>losses.. :( !! ) and see how the score.par changes. Currently, I see the passed
>>pawn value dropping quite significantly and knight outpost values increasing
>>in line with what the author found.
>

Are you starting from the original "hand-tuned" values (by deleting the provided
score.par file)? Or are you starting from the provided score.par file? Just
curious.

>
>also one more thing I noticed on the webpage: it's a brute force program.
>no selective search?
>

It is just a normal Null move program with search extensions...  So there is
quite a bit of selection in the search.

>also it seems if the program uses losses to tune its eval, then it can be
>misinterpreted.
>for instance, a result from a game that was played at 5 min/game will have less
>importance than one from a game that was played at 40/40 or 40/60.
>
>
>so it can be interpreted that an EXchess version that plays only 5 min blitz will
>have one kind of tuned eval, while another one that plays mainly 40/40 will
>have a different one. Which one is best? :)
>

Yes, the learning results from 5 min blitz games might be quite different from
40/40.  One test I did was 200 games with crafty at 1 min bullet.  EXchess lost
a large percentage of the games in the first 100, but did much better in the
second 100 (EXchess still clearly lost the match, but the results were
significantly closer than the first 100 games).  However, when I then matched
this version of EXchess against GNUchess at longer time controls (5 min blitz, I
think), my results were not better (and actually a bit worse) than before the 1
min bullet games against crafty.

 - Dan

P.S.  Unfortunately I don't still have the pgn from most of these test games.
This is the reason I can't give more specific test results.  My computer started
to die a couple of months ago, and when I replaced it with a new machine, I
didn't get all the files I wanted copied over in the confusion.
So the results I am quoting here are from memory and may be inaccurate.  Feel
free to check the behavior of the learning with your own tests.

>these are mainly assumptions, as I need to know (more clearly) what TD learning
>is.
>
>
>thanks
>pavs.




Last modified: Thu, 15 Apr 21 08:11:13 -0700
