Author: Vincent Diepeveen
Date: 07:43:22 03/24/03
On March 23, 2003 at 01:00:47, Robert Hyatt wrote:

Bob, I explained the TD learning here, not Crafty. I didn't knock Crafty at
all. I was talking about the many KnightCap-Crafty games to show why I find
the conclusions drawn from the TD learning experiment overstated. I could have
said TheBaron-KnightCap as well, but it played many games against Crafty. You
focus too much upon the word Crafty here. Focus upon the original question of
the poster, which is: "learning?"

>On March 22, 2003 at 15:10:52, Vincent Diepeveen wrote:
>
>>On March 22, 2003 at 07:21:26, emerson tan wrote:
>>
>>>I heard that Deep Blue uses automatic tuning for its evaluation function.
>>>Does this mean that as it plays games against humans and computers, Deep
>>>Blue will self-tune its evaluation function based on the results of those
>>>games? If so, is it effective? Are the other programs using automatic
>>>tuning also?
>>
>>Many have tried automatic tuning, but they all failed. Human tuning is way
>>better and more accurate. The big problem with autotuners is the number of
>>experiments needed before a good tuning is reached.
>>
>>Basically you can only prove a good tuning by playing games against others.
>>
>>That will, in short, take another few thousand years for a complex
>>evaluation.
>>
>>That doesn't take away that everyone has tried a bit in that direction and
>>probably will keep trying.
>>
>>Current algorithms simply do not work.
>>
>>The much-praised TD learning is not working either.
>>
>>What it did was overreact to things in the long term. For example, at the
>>time when Crafty ran on a single-cpu Pentium Pro 200 and those versions had
>>weak king safety, some years ago, it would find that weakness of Crafty, but
>>not the way we would conclude it. It just concluded that sacrificing pieces
>>and such towards Crafty's king would work a bit.
>
>Why don't you spend more time talking about _your_ program and less time
>knocking mine?
>Crafty 1996 (Pentium Pro 200) did just as well against
>Diep 1996 as the Crafty of today does against the Diep of today. If my king
>safety was weak in 1996, so was yours.
>
>Why don't you give up this particular path of insults? It only makes you
>look idiotic. That "weak" version of Crafty in 1996 finished in 4th place
>at the WMCCC event. Where did yours finish that year? In fact, I'd bet
>that KnightCap had a better record against _your_ program than it did
>against Crafty. Which makes your comparison all the more silly.
>
>Again, you should spend more time working and less time knocking other
>programs. Your program would get better, faster.
>
>>But especially 'a bit' is important here. It would of course have been happy
>>scoring a 20% result or so. So if you gamble 10 games in a row and win 2 of
>>them that way while losing the others without a chance, it might seem to
>>work, but in absolute terms you are doing a bad job, because scoring 20%
>>sucks.
>>
>>Of course those were the days when at 5 0 or 5 3 blitz levels the programs
>>reached very small search depths. Computer-computer games were not seldom
>>tactically dominated in those days around 1997.
>>
>>Concluding that the learning works based upon those slaughter matches (where
>>KnightCap got butchered many games in a row, then won 1 game by some
>>aggressive sacrifice) is not the right conclusion IMHO.
>>
>>Note that other self-learning experts have further criticism of TD learning,
>>which I do not share too much. Their criticism is that some things are
>>hard-coded, so the tuner can't go wrong there. For me that 'cheating' is a
>>smart thing to do, however, because it is clear that tuning without domain
>>knowledge isn't going to work within a year, or a hundred.
>
>The DB guys didn't claim to do TD learning or any other _automated_ learning
>whatsoever. They claimed to have an evaluation _tuning_ tool that did, in
>fact, seem to work.
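[For readers who haven't followed the KnightCap work: TD learning nudges
evaluation weights using the differences between successive position scores in
played games. Below is a minimal sketch of a TD(lambda) update for a linear
evaluation; every name and constant is invented for illustration, and KnightCap's
actual TDLeaf variant backs values up from search leaves rather than root
positions.]

```python
import math

def evaluate(weights, features):
    """Linear evaluation: score = w . x, in pawn-like units."""
    return sum(w * f for w, f in zip(weights, features))

def td_lambda_update(weights, game_features, outcome, alpha=0.01, lam=0.7):
    """One TD(lambda) pass over a single game.

    game_features: one feature vector per position the learner evaluated.
    outcome: final result from the learner's point of view (+1, 0, -1).
    """
    # Squash evaluations into (-1, 1) so they are comparable to the outcome.
    scores = [math.tanh(evaluate(weights, x)) for x in game_features]
    scores.append(float(outcome))   # the terminal "evaluation" is the result
    n = len(game_features)
    new_w = list(weights)
    for t in range(n):
        # Lambda-discounted sum of the temporal differences from time t on.
        delta = sum((lam ** (j - t)) * (scores[j + 1] - scores[j])
                    for j in range(t, n))
        grad_scale = 1.0 - scores[t] ** 2   # d/dz tanh(z) = 1 - tanh(z)^2
        for i, f in enumerate(game_features[t]):
            new_w[i] += alpha * delta * grad_scale * f
    return new_w
```

[Note that after a won game every active feature drifts up a little, which is
exactly the behavior complained about above: one lucky sacrificial win can
reinforce the sacrificing pattern regardless of its objective merit.]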
>
>One problem is that when you change an eval term to correct one flaw, you can
>introduce other bad behavior without knowing it. They tried to solve this
>with a least-squares summation over a bunch of positions, so that you could
>strengthen something that needed help without wrecking the program in
>positions where it was already doing well.
>
>The idea had (and still has) a lot of merit... Just because nobody does it
>today doesn't mean it is (a) bad, (b) impossible, or (c) anything else.
>
>>In later years, when hardware became faster, evaluations also became better,
>>without clear weak links.
>>
>>Evaluations without clear weak links are very hard to tune automatically.
>>
>>Basically tuners have no domain knowledge, so if you have a couple of
>>thousand patterns, not to mention the number of adjustable parameters, it
>>will take more time than there are chess positions to tune them
>>automatically.
>>
>>And it is sad that the much-praised TD learning, which completely sucked
>>everywhere from an objective perspective, is praised so much as a big step.
>>
>>Basically TD learning demonstrates that someone *did* make the effort to
>>implement it, and we can praise the person in question for doing that.
>>
>>Most 'learning' plans never leave the paper.
>>
>>But having seen hundreds of games from KnightCap, I definitely learned that
>>tuning without domain knowledge is really impossible.
>>
>>One result of all that paper learning in the AI world is the following.
>>Chess programs improve and improve, but also get more complex.
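[The least-squares tuning Hyatt describes above can be sketched in a few
lines: fit all the weights at once against a batch of reference positions, so
that a change helping one position is checked against every other position at
the same time. The features, targets, and learning rate below are invented for
illustration; this is not the DB team's actual tool.]

```python
def least_squares_tune(feature_rows, targets, passes=200, lr=0.1):
    """Gradient descent on the summed squared error over all positions:
    sum_p (w . x_p - target_p)^2, for a linear evaluation."""
    n = len(feature_rows[0])
    w = [0.0] * n
    for _ in range(passes):
        grad = [0.0] * n
        for x, t in zip(feature_rows, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) - t
            for i in range(n):
                grad[i] += 2.0 * err * x[i]   # d(err^2)/dw_i = 2 * err * x_i
        for i in range(n):
            w[i] -= lr * grad[i] / len(feature_rows)
    return w
```

[The point of summing over many positions is exactly the one made in the
quote: a weight change that helps one position but hurts the rest shows up
immediately as a larger total error, instead of going unnoticed.]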
>>To list a few of the things programs
>>might simultaneously have (without saying A sucks and B is good):
>> - alpha-beta
>> - negamax
>> - quiescence search
>> - hash tables
>> - multiprobing
>> - complex data structures
>> - nullmove
>> - possible forms of forward pruning
>> - killer moves
>> - move ordering
>> - SEE (qsearch, move ordering)
>> - futility pruning
>> - lazy evaluation
>> - quick evaluation
>> - psq tables to order moves
>> - probcut
>> - reductions
>> - forward pruning (one of the many forms)
>> - iterative deepening
>> - internal iterative deepening (move ordering)
>> - fractional ply depths
>> - parallel search algorithms
>> - check extensions
>> - singular extensions
>> - mating extensions
>> - passed pawn extensions
>> - recapture extensions
>> - other extensions (so many the list is endless)
>>
>>This is just what I could type within 2 minutes. In short: all kinds of
>>algorithms and methods get combined into something more and more complex,
>>and it is all 'integrated' somehow and somewhat domain-dependent, requiring
>>a lot of chess-technical code in order to work well.
>>
>>Because 99.9% of all tuning algorithms do not leave the paper, they usually
>>can be described in a few lines of pseudo code. For that reason most of them
>>do the same similar thing in the same way, but under a new cool name. Just
>>like nullmove R=2 and R=3: exactly the same algorithm (nullmove), with just
>>a small implementation detail different.
>>
>>Yet the AI-learning world is so simplistic that most concepts are simply
>>paper concepts which do not work in the real world.
>>
>>If they ever leave the paper, they perform a silly experiment or draw
>>conclusions in the wrong way. No objective science is getting done there.
>>
>>Those scientists simply hate programming. So the chance of combining
>>methods, and especially of combining them with domain-dependent knowledge,
>>is near zero. Yet in that world TD learning is seen as a grown-up algorithm.
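[To make the "few lines of pseudo code" point concrete: here are two items
from the list above, negamax and alpha-beta, in their minimal textbook form
over a hand-built tree. This toy is my own illustration, not code from any
program discussed; the R=2 vs R=3 remark is the same kind of observation, one
constant changed inside an otherwise identical algorithm.]

```python
def negamax(tree, alpha=-10**9, beta=10**9):
    """tree: either a leaf score (an int, from the side to move's view)
    or a list of child subtrees, one per legal move."""
    if isinstance(tree, int):
        return tree
    best = -10**9
    for child in tree:
        # The child's score is from the opponent's view, hence the negation.
        best = max(best, -negamax(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:   # beta cutoff: the opponent avoids this line
            break
    return best
```

[For example, `negamax([[3, 5], [2, 9]])` returns 3: the first move leads to a
position where the opponent's best reply leaves us +3, the second only +2.]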
>>
>>The good thing about it is that it does something without crashing. The bad
>>thing about it is that it left the paper, so we could see how poorly it
>>worked.
>>
>>It would be too harsh to say that the work has been done for nothing. On the
>>contrary. It simply proves how much of AI is paper work now, and how people
>>can believe in the unknown.
>>
>>I have met at least 100 students who wanted to make a self-learning thing:
>>either a neural network, a genetic algorithm, etc.
>>
>>In fact, in all those years I have seen only 1 thing working, and that was a
>>genetic algorithm finding a shorter path than my silly brute-force search
>>did (we are talking about hundreds of points). That genetic algorithm,
>>however, had domain-dependent knowledge and was already helped with a short
>>route to start with...
>>
>>Best regards,
>>Vincent
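[The one working case described, an evolutionary search that shortens an
existing route, can be sketched like this. The cities, seed tour, and
parameters are all invented; this is a mutation-only evolutionary scheme (a
real GA would add crossover). The key detail matching the anecdote is that the
population is seeded with a decent existing route instead of random noise.]

```python
import random

def tour_length(tour, dist):
    """Total length of a closed tour over a distance matrix."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def mutate(tour):
    """Reverse a random segment of the tour (a 2-opt style move)."""
    i, j = sorted(random.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def evolve(seed_tour, dist, generations=500, pop_size=20, rng_seed=42):
    """Evolutionary search seeded with an existing route (domain knowledge)."""
    random.seed(rng_seed)
    pop = [seed_tour[:] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: tour_length(t, dist))
        survivors = pop[:pop_size // 2]          # truncation selection, elitist
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=lambda t: tour_length(t, dist))
```

[Because the best individual is never discarded, the result can only be as
good as or better than the seed route, which is why seeding with a short route
helps so much.]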
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.