Computer Chess Club Archives


Subject: Re: deep blue's automatic tuning of evaluation function

Author: Robert Hyatt

Date: 08:34:53 03/24/03

On March 24, 2003 at 10:43:22, Vincent Diepeveen wrote:

>On March 23, 2003 at 01:00:47, Robert Hyatt wrote:
>
>Bob, I explained TD learning here, not Crafty. I didn't knock Crafty at all.

Just look at your reference to Crafty.  It was unnecessary to even name an opponent.


>
>I was talking about the many KnightCap-Crafty games to show why I find the
>conclusions drawn from the TD learning experiment overstated.

This is yet another "it is impossible because I can't see how to make it work"
type of discussion?  _Nothing_ says TD won't work.  It _hasn't_ worked really
well so far, but then again full-width search didn't work in 1970 either.  But
it does now.
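For anyone who hasn't seen what a TD update actually does, here is a minimal
sketch of a generic TD(lambda)-style weight update for a linear evaluation, in
the spirit of the KnightCap experiments discussed below (KnightCap's actual
TDLeaf variant differs in detail).  The names, sizes and structure are
illustrative assumptions, not anyone's actual engine code:

/* TD(lambda)-style update for a linear evaluation:
 *   eval(s) = sum_i w[i] * features(s)[i]
 * Sketch only; NFEATURES, MAXPLY and GameRecord are assumed for illustration. */
#define NFEATURES 64
#define MAXPLY    512

typedef struct {
    int    length;                      /* positions recorded from one game   */
    double features[MAXPLY][NFEATURES]; /* feature vector of each position    */
    double value[MAXPLY];               /* eval of each position, e.g. [-1,1] */
} GameRecord;

void td_lambda_update(double w[NFEATURES], const GameRecord *g,
                      double alpha, double lambda)
{
    for (int t = 0; t + 1 < g->length; t++) {
        /* lambda-discounted sum of the temporal differences after time t */
        double decay = 1.0, delta_sum = 0.0;
        for (int j = t; j + 1 < g->length; j++) {
            delta_sum += decay * (g->value[j + 1] - g->value[j]);
            decay *= lambda;
        }
        /* the gradient of a linear eval is just the feature vector */
        for (int i = 0; i < NFEATURES; i++)
            w[i] += alpha * g->features[t][i] * delta_sum;
    }
}

Every weight moves only according to how the evaluation of successive positions
changed during the games played, which is why the quality of the opposition and
of the individual game results matters so much in the discussion below.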




>
>I could have said TheBaron-KnightCap as well, but it played many games against
>Crafty.
>
>You focus too much on the word Crafty here. Focus on the original poster's
>question, which is: "learning?"
>
>
>>On March 22, 2003 at 15:10:52, Vincent Diepeveen wrote:
>>
>>>On March 22, 2003 at 07:21:26, emerson tan wrote:
>>>
>>>>I heard that Deep Blue uses automatic tuning for its evaluation function. Does
>>>>this mean that as it plays games against humans and computers, Deep Blue will
>>>>self-tune its evaluation function based on the results of those games? If so,
>>>>is it effective? Are other programs also using automatic tuning?
>>>
>>>Many have tried automatic tuning, but they all failed. Human tuning is far
>>>better and more accurate. The big problem for autotuners is the number of
>>>experiments needed before a good tuning emerges.
>>>
>>>Basically you can only prove a good tuning by playing games against others.
>>>
>>>In short, that would take another few thousand years to tune a complex
>>>evaluation.
>>>
>>>That doesn't take away that everyone has tried a bit in that direction and
>>>probably will keep trying.
>>>
>>>Current algorithms simply do not work.
>>>
>>>The much-praised TD learning is also simply not working.
>>>
>>>What it did was overreact in the long term. For example, some years ago, when
>>>Crafty ran on a single-CPU Pentium Pro 200 and those versions had weak king
>>>safety, it did not identify that weakness the way we would. It just concluded
>>>that sacrificing pieces and such toward Crafty's king would work a bit.
>>
>>
>>
>>Why don't you spend more time talking about _your_ program and less time
>>knocking mine?  Crafty 1996 (Pentium Pro 200) did just as well against
>>Diep 1996 as Crafty of today does against Diep of today.  If my king safety
>>was weak in 1996, so was yours.
>>
>>Why don't you give up this particular path of insults?  It only makes you
>>look idiotic.  That "weak" version of Crafty in 1996 finished in 4th place
>>at the WMCCC event.  Where did yours finish that year?  In fact, I'd bet
>>that KnightCap had a better record against _your_ program than it did
>>against Crafty.  Which makes your comparison all the more silly.
>>
>>Again, you should spend more time working and less time knocking other
>>programs.  Your program would get better, faster.
>>
>>
>>
>>>
>>>But 'a bit' is especially important here. It of course would have been happy
>>>scoring 20% or so. So if you gamble in 10 games in a row, win 2 that way, and
>>>lose the others without a chance, then that might seem to work, but in
>>>absolute terms you are doing a bad job, because scoring 20% sucks.
>>>
>>>Of course, those were the days when programs reached only very small search
>>>depths at 5 0 or 5 3 blitz levels. Around 1997, computer-computer games were
>>>quite often tactically dominated.
>>>
>>>Concluding that it works, based on those slaughter matches (where the
>>>KnightCap stuff got butchered many games in a row, then won one game by some
>>>aggressive sacrifice), is not the right conclusion, IMHO.
>>>
>>>Note that other self-learning experts have further criticism of TD learning
>>>which I do not share much. Their criticism is that some things are hard-coded,
>>>so the tuner can't go wrong there. For me, that 'cheating' is a smart thing
>>>to do, however, because it is clear that tuning without domain knowledge
>>>isn't going to work within a year, or a hundred.
>>>
>>
>>The DB guys didn't claim to do TD learning or any other _automated_ learning
>>whatsoever.  They claimed to have an evaluation _tuning_ tool that did, in
>>fact, seem to work.
>>
>>One problem is that when you change an eval term to correct one flaw, you can
>>introduce other bad behavior without knowing it.  They tried to solve this with
>>a least-squares summation over a bunch of positions, so that you could improve
>>something that needed help without wrecking the program in positions where it
>>was already doing well.
>>
>>The idea had (and still has) a lot of merit...  Just because nobody does it
>>today doesn't mean it is (a) bad, (b) impossible, or (c) anything else.
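To make the least-squares idea concrete, here is a rough sketch of one gradient
step on the summed squared error over a set of reference positions, again for a
linear evaluation.  This only illustrates the idea described above; it is not
the DB team's actual tool, and the array sizes and names are assumptions:

/* One gradient step on  E(w) = sum_p ( eval(p;w) - target[p] )^2
 * for a linear evaluation  eval(p;w) = sum_i w[i] * f[p][i].
 * Sketch only; NFEATURES, NPOS and the arrays are assumed for illustration. */
#define NFEATURES 64
#define NPOS      1000

void least_squares_step(double w[NFEATURES],
                        const double f[NPOS][NFEATURES],
                        const double target[NPOS],
                        double alpha)
{
    double grad[NFEATURES] = {0.0};

    for (int p = 0; p < NPOS; p++) {
        double eval = 0.0;
        for (int i = 0; i < NFEATURES; i++)
            eval += w[i] * f[p][i];

        /* every position contributes to every weight's gradient, so a
         * change that fixes one flaw is weighed against the positions
         * the program already handles well                              */
        double err = eval - target[p];
        for (int i = 0; i < NFEATURES; i++)
            grad[i] += 2.0 * err * f[p][i];
    }

    for (int i = 0; i < NFEATURES; i++)
        w[i] -= alpha * grad[i];
}

A weight only moves if the net error over all the reference positions says it
should, which is the "don't wreck the positions you already handle well"
property described above.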
>>
>>
>>
>>>In later years, when hardware became faster, evaluations also became better,
>>>without clear weak links.
>>>
>>>Evaluations without clear weak links are very hard to tune automatically.
>>>
>>>Basically, tuners have no domain knowledge, so if you have a couple of
>>>thousand patterns, not to mention the number of adjustable parameters, it
>>>will take more time than there are chess positions to tune them automatically.
>>>
>>>And it is sad that the much-praised TD learning, which completely sucked
>>>everywhere from an objective perspective, is hailed as such a big step.
>>>
>>>Basically, TD learning demonstrates that someone *did* make the effort to
>>>implement it, and we can praise the person in question for that.
>>>
>>>Most 'learning' plans never leave the paper.
>>>
>>>But having seen hundreds of games from KnightCap, I definitely learned that
>>>tuning without domain knowledge is really impossible.
>>>
>>>A consequence of that paper-only learning in the AI world is the following.
>>>Chess programs improve and improve, but also get more complex. To list a few
>>>of the things programs might simultaneously have (without saying A sucks and
>>>B is good):
>>>  - alpha-beta
>>>  - negamax
>>>  - quiescence search
>>>  - hash tables
>>>  - multi-probing
>>>  - complex data structures
>>>  - null move
>>>  - possible forms of forward pruning
>>>  - killer moves
>>>  - move ordering
>>>  - SEE (qsearch, move ordering)
>>>  - futility
>>>  - lazy evaluation
>>>  - quick evaluation
>>>  - PSQ tables to order moves
>>>  - ProbCut
>>>  - reductions
>>>  - forward pruning (one of the many forms)
>>>  - iterative deepening
>>>  - internal iterative deepening (move ordering)
>>>  - fractional ply depth
>>>  - parallel search algorithms
>>>  - check extensions
>>>  - singular extensions
>>>  - mating extensions
>>>  - passed pawn extensions
>>>  - recapture extensions
>>>  - other extensions (so many that the list is endless)
>>>
>>>This is just what I could type in 2 minutes. In short: all kinds of
>>>algorithms and methods get combined into something more and more complex, it
>>>all gets 'integrated' somehow, and some of it is domain dependent, so it
>>>requires a lot of chess-specific code in order to work well.
>>>
>>>Because 99.9% of all tuning algorithms do not leave the paper, they can
>>>usually be described in a few lines of pseudo code. For that reason most of
>>>them are 99.9% doing the same thing in the same way, but under a new cool
>>>name. Just as null move with R=2 and R=3 is exactly the same algorithm (null
>>>move), with only a small implementation detail being different.
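(For readers who haven't seen it: R is just the depth reduction applied after
the side to move passes.  A minimal sketch, where in_check(), make_null_move(),
unmake_null_move() and search() are assumed helpers rather than any particular
program's code, shows that R=2 versus R=3 is literally a one-constant change:)

/* Minimal null-move sketch inside a negamax search.
 * The helpers declared below are assumptions for illustration only.   */
#define NULL_MOVE_R 2            /* set to 3 for the "other" algorithm  */

int  in_check(void);
void make_null_move(void);
void unmake_null_move(void);
int  search(int depth, int alpha, int beta);   /* normal negamax search */

int null_move_cutoff(int depth, int beta)
{
    if (depth <= NULL_MOVE_R || in_check())
        return 0;
    make_null_move();                     /* give the opponent a free move */
    int score = -search(depth - 1 - NULL_MOVE_R, -beta, -beta + 1);
    unmake_null_move();
    return score >= beta;                 /* fail high: prune this node    */
}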
>>>
>>>Yet the AI-learning world is so simplistic that most concepts are simply
>>>paper concepts which do not work in the real world.
>>>
>>>If they ever leave the paper, then a silly experiment is performed or things
>>>are concluded in the wrong way. Objective science is never done there.
>>>
>>>Those scientists simply hate programming. So the chance of combining methods,
>>>and especially combining them with domain-dependent knowledge, is near zero.
>>>In that light, TD learning is seen as a grown-up algorithm.
>>>
>>>The good thing about it is that it does something without crashing. The bad
>>>thing about it is that it left the paper, so we could see how poorly it worked.
>>>
>>>It would be too harsh to say that the work was done for nothing. On the
>>>contrary: it simply proves how much of AI is paper work now, and how people
>>>can believe in the unknown.
>>>
>>>I have met at least 100 students who wanted to make a self-learning thing:
>>>a neural network, a genetic algorithm, etc.
>>>
>>>In fact, in all those years only one thing I have seen actually worked, and
>>>that was a genetic algorithm finding a shorter path than my silly brute-force
>>>search did (we are talking about hundreds of points). That genetic algorithm,
>>>however, had domain-dependent knowledge and was already helped with a short
>>>route to start with...
>>>
>>>Best regards,
>>>Vincent


