Computer Chess Club Archives



Subject: Re: deep blue's automatic tuning of evaluation function

Author: Robert Hyatt

Date: 22:00:47 03/22/03



On March 22, 2003 at 15:10:52, Vincent Diepeveen wrote:

>On March 22, 2003 at 07:21:26, emerson tan wrote:
>
>>I heard that Deep Blue uses automatic tuning for its evaluation function.
>>Does this mean that as it plays games against humans and computers, Deep
>>Blue will self-tune its evaluation function based on the results of those
>>games? If so, is it effective? Are the other programs using automatic
>>tuning also?
>
>Many have tried automatic tuning, but they all failed. Human tuning is way
>better and more accurate. The big problem with autotuners is the number of
>experiments needed before a good tuning is reached.
>
>Basically you can only prove a good tuning by playing games against others.
>
>In short, that will take another few thousand years to tune a complex
>evaluation.
>
>That doesn't take away that everyone has tried a bit in that direction and
>probably will keep trying.
>
>Current algorithms simply do not work.
>
>Also, the much-praised TD learning simply does not work.
>
>What it did was overreact to things in the long term. For example, some
>years ago when Crafty ran on a single-cpu Pentium Pro 200 and those versions
>had weak king safety, it would find that weakness of Crafty, but not in the
>way we would describe it. It just concluded that sacrificing pieces and such
>against Crafty's king would work a bit.



Why don't you spend more time talking about _your_ program and less time
knocking mine?  Crafty 1996 (Pentium Pro 200) did just as well against
Diep 1996 as Crafty of today does against Diep of today.  If my king safety
was weak in 1996, so was yours.

Why don't you give up this particular path of insults?  It only makes you
look idiotic.  That "weak" version of Crafty in 1996 finished in 4th place
at the WMCCC event.  Where did yours finish that year?  In fact, I'd bet
that KnightCap had a better record against _your_ program than it did
against Crafty, which makes your comparison all the more silly.

Again, you should spend more time working and less time knocking other
programs.  Your program would get better, faster.



>
>But especially 'a bit' is important here. It of course would have been happy
>scoring 20% or so. So if you gamble in 10 games in a row, win 2 of them that
>way and lose the others without a chance, then that might seem to work, but
>in absolute terms you are doing a bad job, because scoring 20% sucks.
>
>Of course those were the days when, at 5 0 or 5 3 blitz levels, the programs
>got very small search depths. Not seldom were computer-computer games
>tactically dominated in those days around 1997.
>
>Concluding that the learning works based upon those slaughter matches (where
>the KnightCap stuff got butchered many games in a row, then won 1 game by
>some aggressive sacrifice) is not the right conclusion IMHO.
>
>Note that other self-learning experts have more criticism of TD learning,
>which I do not share too much. Their criticism is that some stuff is
>hard-coded, so the tuner can't go wrong there. For me that 'cheating' is a
>smart thing to do, however, because it is clear that tuning without domain
>knowledge isn't going to work within a year or a hundred.
>

The DB guys didn't claim to do TD learning or any other _automated_ learning
whatsoever.  They claimed to have an evaluation _tuning_ tool that did, in
fact, seem to work.
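
(For contrast, the TD-style tuning that KnightCap used nudges each eval
weight toward what later positions in the same game suggested.  Here is a
rough sketch of that kind of update, with a plain linear eval and invented
names purely for illustration -- this is neither KnightCap's code nor
anything Deep Blue did:)

/*
 * TD(lambda)-style update for a linear evaluation, eval = sum_j w[j]*f[j].
 * For each of the n positions searched in one game we recorded the feature
 * vector at the PV leaf and the score the search returned there.
 * NWEIGHTS, MAXPLIES and all names are illustrative assumptions.
 */
#define NWEIGHTS  64
#define MAXPLIES 512

void td_lambda_update(double w[NWEIGHTS],
                      double leaf_feature[MAXPLIES][NWEIGHTS],
                      double leaf_score[MAXPLIES],
                      int n, double alpha, double lambda)
{
    for (int t = 0; t < n - 1; t++) {
        /* lambda-discounted sum of the temporal differences from t onward */
        double td_sum = 0.0, decay = 1.0;
        for (int k = t; k < n - 1; k++) {
            td_sum += decay * (leaf_score[k + 1] - leaf_score[k]);
            decay  *= lambda;
        }
        /* for a linear eval the gradient w.r.t. w[j] is just feature j */
        for (int j = 0; j < NWEIGHTS; j++)
            w[j] += alpha * leaf_feature[t][j] * td_sum;
    }
}

Nothing in a loop like that knows any chess; it just chases whatever the
leaf scores did in that particular game, which is exactly how it can "learn"
to sacrifice against one opponent's weak king and call that progress.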

One problem is that when you change an eval term to correct one flaw, you can
introduce other bad behavior without knowing it.  They tried to solve this
with a least-squares summation over a bunch of positions, so that you could
increase something that needed help without wrecking the program in positions
where it was already doing well.
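
To make that idea concrete, here is a toy version of the least-squares fit
as I understand it -- my own sketch, not their tool, with the linear term
and every name invented: score the whole suite with a candidate value for
one weight, sum the squared errors against the target scores, and keep the
value that minimizes the total.

#define NPOS 1000                    /* size of the test suite (illustrative) */

static double base_eval[NPOS];       /* eval of each position, tuned term zeroed */
static double feature[NPOS];         /* how strongly the tuned term fires there  */
static double target[NPOS];          /* desired scores, e.g. from deep searches  */

/* eval of position p with the tuned term's weight set to w (linear term) */
static double eval_with_weight(int p, double w)
{
    return base_eval[p] + w * feature[p];
}

static double suite_error(double w)
{
    double err = 0.0;
    for (int p = 0; p < NPOS; p++) {
        double d = eval_with_weight(p, w) - target[p];
        err += d * d;                /* squared error summed over ALL positions */
    }
    return err;
}

/* crude 1-D scan: keep the weight value that minimizes the summed error */
double tune_weight(double lo, double hi, double step)
{
    double best_w = lo, best_e = suite_error(lo);
    for (double w = lo + step; w <= hi; w += step) {
        double e = suite_error(w);
        if (e < best_e) { best_e = e; best_w = w; }
    }
    return best_w;
}

Because the error is summed over the whole suite, a change to the weight only
sticks if it helps on balance; it cannot quietly wreck positions the program
already handled well.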

The idea had (and still has) a lot of merit...  Just because nobody does it
today doesn't mean it is (a) bad, (b) impossible, or (c) anything else.



>In later years, when hardware became faster, evaluations also became better,
>without clear weak links.
>
>Evaluations without clear weak links are very hard to tune automatically.
>
>Basically tuners have no domain knowledge, so if you have a couple of
>thousand patterns, not to mention the number of adjustable parameters, it
>will take more time than there are chess positions to tune them
>automatically.
>
>And it is sad that the much-praised TD learning, which completely sucked
>everywhere from an objective perspective, is praised so much as a big step.
>
>Basically TD learning demonstrates that someone *did* make the effort to
>implement it, and we can praise the person in question for doing that.
>
>Most 'learning' plans never leave the paper at all.
>
>But having seen hundreds of games from KnightCap, I definitely learned that
>tuning without domain knowledge is really impossible.
>
>A result of all that paper learning in the AI world is the following. Chess
>programs improve and improve, but also get more complex. To list a few of
>the things programs might simultaneously have (without saying A sucks and B
>is good):
>  - alpha-beta
>  - negamax
>  - quiescence search
>  - hash tables
>  - multiprobing
>  - complex data structures
>  - null move
>  - possible forms of forward pruning
>  - killer moves
>  - move ordering
>  - SEE (qsearch, move ordering)
>  - futility
>  - lazy evaluation
>  - quick evaluation
>  - PSQ tables to order moves
>  - ProbCut
>  - reductions
>  - forward pruning (one of the many forms)
>  - iterative deepening
>  - internal iterative deepening (move ordering)
>  - fractional ply depth
>  - parallel search algorithms
>  - check extensions
>  - singular extensions
>  - mating extensions
>  - passed pawn extensions
>  - recapture extensions
>  - other extensions (so many the list is endless)
>
>This is just what I could type within 2 minutes. In short: all kinds of
>algorithms and methods get combined into something more and more complex,
>and it is all 'integrated' somehow and somewhat domain-dependent, so it
>requires a lot of chess-technical code in order to work well.
>
>Because 99.9% of all tuning algorithms do not leave the paper, they usually
>can be described in a few lines of pseudo code. For that reason most of them
>are 99.9% doing the same thing in the same way, but have a new cool name.
>Just like null move with R=2 and R=3 is exactly the same algorithm (null
>move), with only a small implementation detail being different.
>
>Yet the AI-learning world is so simplistic that most concepts are simply
>paper concepts which do not work in the real world.
>
>If they ever leave the paper, then they perform a silly experiment or
>conclude things in the wrong way. Objective science never gets done there.
>
>Those scientists simply hate programming. So the possibility of combining
>methods, and especially of combining them with domain-dependent knowledge,
>is near zero. In that way TD learning is seen as a grown-up algorithm.
>
>The good thing about it is that it does something without crashing. The bad
>thing about it is that it left the paper, so we could see how poorly it
>worked.
>
>It would be too harsh to say that the work has been done for nothing. On the
>contrary, it simply proves how much of AI is paperwork now and how people
>can believe in the unknown.
>
>I have met at least 100 students who wanted to make a self-learning thing,
>either a neural network, a genetic algorithm, etc.
>
>In fact, in all those years I have seen only 1 thing working, and that was a
>genetic algorithm finding a shorter path than my silly brute-force search
>did (we are talking about hundreds of points). That genetic algorithm,
>however, had domain-dependent knowledge and was already helped with a short
>route to start with...
>
>Best regards,
>Vincent


