Computer Chess Club Archives

Subject: Re: Hello from Edmonton (and on Temporal Differences)

Author: Sune Fischer

Date: 11:06:55 08/04/02

On August 04, 2002 at 13:26:05, Vincent Diepeveen wrote:

>On August 04, 2002 at 11:47:01, Sune Fischer wrote:
>
>>On August 04, 2002 at 09:13:31, Vincent Diepeveen wrote:
>>
>>>On August 01, 2002 at 05:16:55, Sune Fischer wrote:
>>>
>>>We must not think too simply about autotuning. It is a complicated
>>>matter. Yet the obvious point is that the autotuner has no domain-specific
>>>knowledge.
>>>
>>>So suppose that someone *manages* to find a good way of tuning.
>>>
>>>Even the simple evaluation of Deep Blue had only a few tens of
>>>patterns, each pattern indexed by an array of 64 or so.
>>>
>>>We are talking about 5000 or so adjustable parameters for an average
>>>program (I like round numbers).
>>>
>>>To tune that even at the incredibly good
>>>  O (n log n), that's (using log base 2)
>>>
>>>==> 5000 x 12 = 60000 operations.
>>>
>>>Each operation consists of playing some 250 games at the auto player.
>>>No commercial program ever managed to improve by playing blitz...
>>>
>>>250 games x 60000 = 15,000,000 games.
>>>
>>>See the problem with learning that slowly?
>>
>>No, TDLeaf is a steepest-descent algorithm; if it works it will go much
>>faster, because it moves directly along the negative gradient.
>>I'm not saying it will be easy, or that it won't require a large number of
>>games, but I believe its potential is greater than what is humanly possible.
>>
>>
>>>Now we are talking about a simple thing called chess, just 64 squares.
>>>If I autotune something to drive my car, there are a zillion parameters
>>>to tune ;)
>>
>>Yes, but you tune them _all at once_, so it's really not that bad :)
>
>Exactly the answer I wanted to hear. It is simply impossible to tune
>them all very well at once without domain knowledge. If you get a perfect
>tuning in
>
>O (n log n), you would already deserve a Nobel prize.

Well, I wouldn't count on it :)
Steepest descent can train hundreds of weights if the energy landscape is
smooth enough; just think of back propagation.
I believe this will be the case for a chess evaluator: linear forms with
independent variables should be doable IMO.
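
To make that concrete, here is a rough sketch of the kind of TDLeaf(lambda)
update I mean, assuming a purely linear evaluator; all the names and sizes
(tdleaf_update, N_WEIGHTS, N_LEAVES) are just illustrative, not taken from
DIEP, Crafty or any other engine:

/* Minimal sketch of a TDLeaf(lambda) weight update for a linear
 * evaluator eval(pos) = sum_i w[i]*f[i](pos). */

#define N_WEIGHTS 4   /* toy number of evaluation weights      */
#define N_LEAVES  5   /* leaf positions from one training game */

void tdleaf_update(double w[N_WEIGHTS],
                   const double f[N_LEAVES][N_WEIGHTS], /* leaf features */
                   const double leaf_eval[N_LEAVES],    /* leaf scores   */
                   double alpha,                        /* learning rate */
                   double lambda)                       /* TD decay      */
{
    for (int t = 0; t < N_LEAVES - 1; t++) {
        /* Sum of lambda-discounted future temporal differences. */
        double td = 0.0, decay = 1.0;
        for (int j = t; j < N_LEAVES - 1; j++) {
            td    += decay * (leaf_eval[j + 1] - leaf_eval[j]);
            decay *= lambda;
        }
        /* For a linear eval, d(eval)/d(w[i]) is just the feature
         * value f[t][i], so every weight steps along the gradient
         * in the same pass -- all parameters move at once. */
        for (int i = 0; i < N_WEIGHTS; i++)
            w[i] += alpha * f[t][i] * td;
    }
}

One update like this per game is cheap; the expensive part is still
generating the games themselves.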

>The tuning we need is definitely not 'everything tuned a little';
>errors are simply not acceptable. One parameter which is -1.0 instead of
>+0.22 is simply an unacceptable error. Hand tuning is even
>more accurate. It even sees the difference between 0.22 and 0.40 very
>clearly. In the case of the pros, even the difference between 0.22 and 0.20
>gets felt in some cases (bishop vs knight).

That could happen if they start off from random values; it would be a
so-called local minimum. If they are initialized with the best known values,
they should improve from there.

>Obviously you won't achieve such accuracy in O (n log n) with TD learning;
>it's a crude and primitive algorithm which can be considered lucky if the plus
>and minus signs are even guessed right.

With 5000 weights, yeah, you'll need 10000 games, but isn't that doable
nowadays? You might see rapid improvement in the beginning, and then the
curvature diminishes and the process slows down.

>The pros are not far off perfect tuning nowadays. Of course the parameters
>can be improved, but definitely not their tuning.

But how do you, as a human, find the best values?
You just sit there, watch it play, and then go: hey, it should have played e4
instead of e3, so I'd better adjust...
How much do you adjust, and do you do this for all 5000 weights?
This ad hoc method sucks big time, plain and simple.
The pros can't be doing it this way; I can't believe that.

>I doubt you can achieve
>much better tuning with Crafty either.

If Crafty were a little easier to hack I would do it, but I can't even get it
to compile (there is some stack-space problem with egtb.c).

>It's in programs like DIEP that tuning can be improved big time, but of course
>we are talking about FINE tuning: whether it's 0.023 instead of 0.020.
>
>In DIEP I work not at 1/100th of a pawn but at 1/1000th of a pawn. Another
>thousand horrors more :)

Floats could be needed for this; integer arithmetic (even in millipawns) does
not work well with sums of fractions...
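
For example (made-up numbers): apply a fractional weight like 4/10 to a
1-millipawn term a hundred times, and in integer arithmetic every term
truncates to zero:

#include <stdio.h>

int main(void)
{
    int    sum_int   = 0;
    double sum_float = 0.0;

    /* 100 small positional terms, each worth 0.4 millipawns. */
    for (int i = 0; i < 100; i++) {
        sum_int   += 1 * 4 / 10;       /* integer division: term is 0 */
        sum_float += 1.0 * 4.0 / 10.0; /* term is 0.4                 */
    }

    printf("integer sum: %d millipawns\n", sum_int);     /* prints 0    */
    printf("float sum:   %.1f millipawns\n", sum_float); /* prints 40.0 */
    return 0;
}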

-S.


