Author: Vincent Diepeveen
Date: 07:43:22 03/24/03
On March 23, 2003 at 01:00:47, Robert Hyatt wrote:

Bob, I explained the TD learning here, not Crafty. I didn't knock Crafty at
all. I was talking about the many KnightCap-Crafty games to show why I find
the conclusions drawn from the TD learning experiment overstated. I could have
said TheBaron-KnightCap as well, but it played many games against Crafty. You
focus too much upon the word Crafty here. Focus upon the original question of
the poster, which is: "learning?"

>On March 22, 2003 at 15:10:52, Vincent Diepeveen wrote:
>
>>On March 22, 2003 at 07:21:26, emerson tan wrote:
>>
>>>I heard that Deep Blue uses automatic tuning for its evaluation function.
>>>Does this mean that as it plays games against humans and computers, Deep
>>>Blue will self-tune its evaluation function based on the results of those
>>>games? If so, is it effective? Are the other programs using automatic
>>>tuning also?
>>
>>Many have tried automatic tuning, but they all failed. Human tuning is way
>>better and more accurate. The big problem with autotuners is the number of
>>experiments needed before a good tuning is reached.
>>
>>Basically you can only prove a good tuning by playing games against others.
>>
>>That will, in short, take another few thousand years for a complex
>>evaluation.
>>
>>That doesn't take away that everyone has tried a bit in that direction and
>>probably will keep trying.
>>
>>Current algorithms simply do not work.
>>
>>The much-praised TD learning is not working either.
>>
>>What it did was overreact to things in the long term. For example, at the
>>time when Crafty ran on a single-cpu Pentium Pro 200 and those versions had
>>weak king safety, some years ago, it would find that weakness of Crafty, but
>>not the way we would conclude it. It just concluded that sacrificing pieces
>>and such towards Crafty's king would work a bit.
>
>Why don't you spend more time talking about _your_ program and less time
>knocking mine?
>Crafty 1996 (Pentium Pro 200) did just as well against
>Diep 1996 as the Crafty of today does against the Diep of today. If my king
>safety was weak in 1996, so was yours.
>
>Why don't you give up this particular path of insults? It only makes you
>look idiotic. That "weak" version of Crafty in 1996 finished in 4th place
>at the WMCCC event. Where did yours finish that year? In fact, I'd bet
>that KnightCap had a better record against _your_ program than it did
>against Crafty. Which makes your comparison all the more silly.
>
>Again, you should spend more time working and less time knocking other
>programs. Your program would get better, faster.
>
>>But especially 'a bit' is important here. It would of course have been happy
>>scoring a 20% result or so. So if you gamble 10 games in a row and win 2 of
>>them that way while losing the others without a chance, it might seem to
>>work, but in absolute terms you are doing a bad job, because scoring 20%
>>sucks.
>>
>>Of course those were the days when at 5 0 or 5 3 blitz levels the programs
>>reached very small search depths. Computer-computer games were not seldom
>>tactically dominated in those days around 1997.
>>
>>Concluding that the learning works based upon those slaughter matches (where
>>KnightCap got butchered many games in a row, then won 1 game by some
>>aggressive sacrifice) is not the right conclusion IMHO.
>>
>>Note that other self-learning experts have further criticism of TD learning,
>>which I do not share too much. Their criticism is that some things are
>>hard-coded, so the tuner can't go wrong there. For me that 'cheating' is a
>>smart thing to do, however, because it is clear that tuning without domain
>>knowledge isn't going to work within a year, or a hundred.
>
>The DB guys didn't claim to do TD learning or any other _automated_ learning
>whatsoever. They claimed to have an evaluation _tuning_ tool that did, in
>fact, seem to work.
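[For readers who haven't followed the KnightCap work: TD learning nudges
evaluation weights using the differences between successive position scores in
played games. Below is a minimal sketch of a TD(lambda) update for a linear
evaluation; every name and constant is invented for illustration, and KnightCap's
actual TDLeaf variant backs values up from search leaves rather than root
positions.]

```python
import math

def evaluate(weights, features):
    """Linear evaluation: score = w . x, in pawn-like units."""
    return sum(w * f for w, f in zip(weights, features))

def td_lambda_update(weights, game_features, outcome, alpha=0.01, lam=0.7):
    """One TD(lambda) pass over a single game.

    game_features: one feature vector per position the learner evaluated.
    outcome: final result from the learner's point of view (+1, 0, -1).
    """
    # Squash evaluations into (-1, 1) so they are comparable to the outcome.
    scores = [math.tanh(evaluate(weights, x)) for x in game_features]
    scores.append(float(outcome))   # the terminal "evaluation" is the result
    n = len(game_features)
    new_w = list(weights)
    for t in range(n):
        # Lambda-discounted sum of the temporal differences from time t on.
        delta = sum((lam ** (j - t)) * (scores[j + 1] - scores[j])
                    for j in range(t, n))
        grad_scale = 1.0 - scores[t] ** 2   # d/dz tanh(z) = 1 - tanh(z)^2
        for i, f in enumerate(game_features[t]):
            new_w[i] += alpha * delta * grad_scale * f
    return new_w
```

[Note that after a won game every active feature drifts up a little, which is
exactly the behavior complained about above: one lucky sacrificial win can
reinforce the sacrificing pattern regardless of its objective merit.]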
>
>One problem is that when you change an eval term to correct one flaw, you can
>introduce other bad behavior without knowing it. They tried to solve this
>with a least-squares summation over a bunch of positions, so that you could
>strengthen something that needed help without wrecking the program in
>positions where it was already doing well.
>
>The idea had (and still has) a lot of merit... Just because nobody does it
>today doesn't mean it is (a) bad, (b) impossible, or (c) anything else.
>
>>In later years, when hardware became faster, evaluations also became better,
>>without clear weak links.
>>
>>Evaluations without clear weak links are very hard to tune automatically.
>>
>>Basically tuners have no domain knowledge, so if you have a couple of
>>thousand patterns, not to mention the number of adjustable parameters, it
>>will take more time than there are chess positions to tune them
>>automatically.
>>
>>And it is sad that the much-praised TD learning, which completely sucked
>>everywhere from an objective perspective, is praised so much as a big step.
>>
>>Basically TD learning demonstrates that someone *did* make the effort to
>>implement it, and we can praise the person in question for doing that.
>>
>>Most 'learning' plans never leave the paper.
>>
>>But having seen hundreds of games from KnightCap, I definitely learned that
>>tuning without domain knowledge is really impossible.
>>
>>One result of all that paper learning in the AI world is the following.
>>Chess programs improve and improve, but also get more complex.
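[The least-squares tuning Hyatt describes above can be sketched in a few
lines: fit all the weights at once against a batch of reference positions, so
that a change helping one position is checked against every other position at
the same time. The features, targets, and learning rate below are invented for
illustration; this is not the DB team's actual tool.]

```python
def least_squares_tune(feature_rows, targets, passes=200, lr=0.1):
    """Gradient descent on the summed squared error over all positions:
    sum_p (w . x_p - target_p)^2, for a linear evaluation."""
    n = len(feature_rows[0])
    w = [0.0] * n
    for _ in range(passes):
        grad = [0.0] * n
        for x, t in zip(feature_rows, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) - t
            for i in range(n):
                grad[i] += 2.0 * err * x[i]   # d(err^2)/dw_i = 2 * err * x_i
        for i in range(n):
            w[i] -= lr * grad[i] / len(feature_rows)
    return w
```

[The point of summing over many positions is exactly the one made in the
quote: a weight change that helps one position but hurts the rest shows up
immediately as a larger total error, instead of going unnoticed.]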
>>To list a few of the things programs
>>might simultaneously have (without saying A sucks and B is good):
>> - alpha-beta
>> - negamax
>> - quiescence search
>> - hash tables
>> - multiprobing
>> - complex data structures
>> - nullmove
>> - possible forms of forward pruning
>> - killer moves
>> - move ordering
>> - SEE (qsearch, move ordering)
>> - futility pruning
>> - lazy evaluation
>> - quick evaluation
>> - psq tables to order moves
>> - probcut
>> - reductions
>> - forward pruning (one of the many forms)
>> - iterative deepening
>> - internal iterative deepening (move ordering)
>> - fractional ply depths
>> - parallel search algorithms
>> - check extensions
>> - singular extensions
>> - mating extensions
>> - passed pawn extensions
>> - recapture extensions
>> - other extensions (so many the list is endless)
>>
>>This is just what I could type within 2 minutes. In short: all kinds of
>>algorithms and methods get combined into something more and more complex,
>>and it is all 'integrated' somehow and somewhat domain-dependent, requiring
>>a lot of chess-technical code in order to work well.
>>
>>Because 99.9% of all tuning algorithms do not leave the paper, they usually
>>can be described in a few lines of pseudo code. For that reason most of them
>>do the same similar thing in the same way, but under a new cool name. Just
>>like nullmove R=2 and R=3: exactly the same algorithm (nullmove), with just
>>a small implementation detail different.
>>
>>Yet the AI-learning world is so simplistic that most concepts are simply
>>paper concepts which do not work in the real world.
>>
>>If they ever leave the paper, they perform a silly experiment or draw
>>conclusions in the wrong way. No objective science is getting done there.
>>
>>Those scientists simply hate programming. So the chance of combining
>>methods, and especially of combining them with domain-dependent knowledge,
>>is near zero. Yet in that world TD learning is seen as a grown-up algorithm.
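[To make the "few lines of pseudo code" point concrete: here are two items
from the list above, negamax and alpha-beta, in their minimal textbook form
over a hand-built tree. This toy is my own illustration, not code from any
program discussed; the R=2 vs R=3 remark is the same kind of observation, one
constant changed inside an otherwise identical algorithm.]

```python
def negamax(tree, alpha=-10**9, beta=10**9):
    """tree: either a leaf score (an int, from the side to move's view)
    or a list of child subtrees, one per legal move."""
    if isinstance(tree, int):
        return tree
    best = -10**9
    for child in tree:
        # The child's score is from the opponent's view, hence the negation.
        best = max(best, -negamax(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:   # beta cutoff: the opponent avoids this line
            break
    return best
```

[For example, `negamax([[3, 5], [2, 9]])` returns 3: the first move leads to a
position where the opponent's best reply leaves us +3, the second only +2.]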
>>
>>The good thing about it is that it does something without crashing. The bad
>>thing about it is that it left the paper, so we could see how poorly it
>>worked.
>>
>>It would be too harsh to say that the work has been done for nothing. On the
>>contrary. It simply proves how much of AI is paper work now, and how people
>>can believe in the unknown.
>>
>>I have met at least 100 students who wanted to make a self-learning thing:
>>either a neural network, a genetic algorithm, etc.
>>
>>In fact, in all those years I have seen only 1 thing working, and that was a
>>genetic algorithm finding a shorter path than my silly brute-force search
>>did (we are talking about hundreds of points). That genetic algorithm,
>>however, had domain-dependent knowledge and was already helped with a short
>>route to start with...
>>
>>Best regards,
>>Vincent
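[The one working case described, an evolutionary search that shortens an
existing route, can be sketched like this. The cities, seed tour, and
parameters are all invented; this is a mutation-only evolutionary scheme (a
real GA would add crossover). The key detail matching the anecdote is that the
population is seeded with a decent existing route instead of random noise.]

```python
import random

def tour_length(tour, dist):
    """Total length of a closed tour over a distance matrix."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def mutate(tour):
    """Reverse a random segment of the tour (a 2-opt style move)."""
    i, j = sorted(random.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def evolve(seed_tour, dist, generations=500, pop_size=20, rng_seed=42):
    """Evolutionary search seeded with an existing route (domain knowledge)."""
    random.seed(rng_seed)
    pop = [seed_tour[:] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: tour_length(t, dist))
        survivors = pop[:pop_size // 2]          # truncation selection, elitist
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=lambda t: tour_length(t, dist))
```

[Because the best individual is never discarded, the result can only be as
good as or better than the seed route, which is why seeding with a short route
helps so much.]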
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.