Author: Robert Hyatt
Date: 08:34:53 03/24/03
On March 24, 2003 at 10:43:22, Vincent Diepeveen wrote:

>On March 23, 2003 at 01:00:47, Robert Hyatt wrote:
>
>bob i explained the TD learning here. not crafty. i didn't knock
>against crafty at all.

Just look at your reference to Crafty. It was unnecessary to even name an
opponent.

>i was talking about the many games knightcap-crafty to show why i find the
>results drawn from the TD learning experiment are overreacted.

Is this yet another "it is impossible because I can't see how to make it
work" type of discussion? _Nothing_ says TD won't work. It _hasn't_ worked
very well so far, but then again, full-width search didn't work in 1970
either. It does now.

>i could have said thebaron-knightcap as well, but it played many games
>against crafty.
>
>you focus too much upon the word crafty here. focus upon the original
>question of the poster, which is: "learning?"
>
>>On March 22, 2003 at 15:10:52, Vincent Diepeveen wrote:
>>
>>>On March 22, 2003 at 07:21:26, emerson tan wrote:
>>>
>>>>I heard that Deep Blue uses automatic tuning for its evaluation
>>>>function. Does this mean that as it plays games against humans and
>>>>computers, Deep Blue will self-tune its evaluation function based on
>>>>the results of those games? If so, is it effective? Are the other
>>>>programs using automatic tuning also?
>>>
>>>Many have tried automatic tuning, but they all failed. Human tuning is
>>>far better and more accurate. The big problem for autotuners is the
>>>number of experiments needed before a good tuning emerges.
>>>
>>>Basically you can only prove a tuning is good by playing games against
>>>others.
>>>
>>>That will, in short, take another few thousand years for a complex
>>>evaluation.
>>>
>>>That doesn't take away that everyone has tried a bit in that direction
>>>and probably will keep trying.
>>>
>>>Current algorithms simply do not work.
>>>
>>>Also, the much-praised TD learning is simply not working.
>>>
>>>What it did was overreact in the long term. So, for example, some years
>>>ago, when crafty ran on a single-cpu pentium pro 200 and those versions
>>>had weak king safety, it would find that weakness of crafty, but not the
>>>way we would conclude it. It just concluded that sacrificing pieces and
>>>such against the crafty king would work a bit.
>>
>>Why don't you spend more time talking about _your_ program and less time
>>knocking mine? Crafty 1996 (pentium pro 200) did just as well against
>>Diep 1996 as the Crafty of today does against the Diep of today. If my
>>king safety was weak in 1996, so was yours.
>>
>>Why don't you give up this particular path of insults? It only makes you
>>look idiotic. That "weak" version of Crafty in 1996 finished in 4th place
>>at the WMCCC event. Where did yours finish that year? In fact, I'd bet
>>that KnightCap had a better record against _your_ program than it did
>>against Crafty. Which makes your comparison all the more silly.
>>
>>Again, you should spend more time working and less time knocking other
>>programs. Your program would get better, faster.
>>
>>>but especially 'a bit' is important here. It of course would have been
>>>happy scoring 20% or so. So if you gamble 10 games in a row, win 2 that
>>>way, and lose the others without a chance, then that might seem to
>>>work, but in absolute terms you are doing a bad job, because scoring
>>>20% sucks.
>>>
>>>Of course those were the days when, at 5 0 or 5 3 blitz levels, the
>>>programs got very small search depths. Not seldom were
>>>computer-computer games tactically dominated in those days around 1997.
>>>
>>>Concluding that the pruning works based upon those slaughter matches
>>>(where the knightcap stuff got butchered many games in a row, then won
>>>1 game by some aggressive sacrifice) is not the right conclusion IMHO.
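For readers who haven't seen the technique under discussion: KnightCap tuned its evaluation with a TD(λ)-style weight update (its actual variant was TDLeaf(λ)). A minimal sketch of the plain TD(λ) update for a linear evaluation follows; all names are illustrative, not taken from KnightCap's source.

```python
# Minimal TD(lambda) weight update for a linear evaluation
# eval(pos) = dot(weights, features(pos)). Illustrative sketch only,
# not KnightCap's actual code (which used TDLeaf(lambda)).

def td_lambda_update(weights, feature_vecs, values, alpha=0.01, lam=0.7):
    """One pass over a game's positions.

    feature_vecs[t] - feature vector of position t
    values[t]       - evaluation of position t (game result at the end)
    """
    n = len(weights)
    trace = [0.0] * n                       # eligibility trace per weight
    for t in range(len(values) - 1):
        delta = values[t + 1] - values[t]   # temporal-difference error
        for i in range(n):
            # For a linear eval, d(eval)/d(w_i) is just the feature value.
            trace[i] = lam * trace[i] + feature_vecs[t][i]
            weights[i] += alpha * delta * trace[i]
    return weights
```

The point Diepeveen attacks is visible here: nothing in the update knows chess, it only chases whatever score difference the games happen to produce.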
>>>
>>>Note other self-learning experts have more criticism of TD learning,
>>>which i do not share much. Their criticism is that some stuff is
>>>hard-coded, so the tuner can't go wrong there. For me that 'cheating'
>>>is a smart thing to do, however, because it is clear that tuning
>>>without domain knowledge isn't going to work within a year or 100.
>>
>>The DB guys didn't claim to do TD learning or any other _automated_
>>learning whatsoever. They claimed to have an evaluation _tuning_ tool
>>that did, in fact, seem to work.
>>
>>One problem is that when you change an eval term to correct one flaw,
>>you can introduce other bad behavior without knowing it. They tried to
>>solve this with a least-squares summation over a bunch of positions, so
>>that you could increase something that needed help without wrecking the
>>program in positions where it was already doing well.
>>
>>The idea had (and still has) a lot of merit... Just because nobody does
>>it today doesn't mean it is (a) bad, (b) impossible, or (c) anything
>>else.
>>
>>>In later years, when hardware became faster, evaluations also became
>>>better, without clear weak links.
>>>
>>>Evaluations without clear weak links are very hard to tune
>>>automatically.
>>>
>>>Basically tuners have no domain knowledge, so if you have a couple of
>>>thousand patterns, not to mention the number of adjustable parameters,
>>>it will take more time than there are chess positions to tune them
>>>automatically.
>>>
>>>And it is sad that the much-praised TD learning, which completely
>>>sucked everywhere from an objective perspective, is praised so much as
>>>a big step.
>>>
>>>Basically TD learning demonstrates that someone *did* make the effort
>>>to implement TD learning, and we can praise the person in question for
>>>doing that.
>>>
>>>Most 'learning' plans never leave the paper.
>>>
>>>But having seen hundreds of games from knightcap, i definitely learned
>>>that tuning without domain knowledge is really impossible.
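The least-squares idea described above can be sketched concretely: fit linear evaluation weights to target scores over a whole batch of positions at once, so a change that helps one position is balanced against every other. This is a generic illustration under assumed names, not the DB team's actual tool.

```python
# Least-squares fit of linear eval weights over a batch of positions:
# minimize sum_p (dot(w, features(p)) - target(p))**2.
# Solved via the normal equations with plain Gaussian elimination to
# stay dependency-free; all names are illustrative.

def fit_eval_weights(features, targets):
    n = len(features[0])
    # Normal equations: A = F^T F, b = F^T y.
    A = [[sum(f[i] * f[j] for f in features) for j in range(n)]
         for i in range(n)]
    b = [sum(f[i] * y for f, y in zip(features, targets)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            m = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= m * A[col][c]
            b[r] -= m * b[col]
    # Back substitution.
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (b[r] - sum(A[r][c] * w[c] for c in range(r + 1, n))) / A[r][r]
    return w
```

For example, positions with feature vectors `[1,0]`, `[0,1]`, `[1,1]` and targets `2, 3, 5` recover the weights `[2, 3]` exactly.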
>>>
>>>A result of that paper-only learning in the AI world is this. Chess
>>>programs improve and improve, but also get more complex. To list a few
>>>of the things programs might simultaneously have (without saying A
>>>sucks and B is good):
>>> - alpha-beta
>>> - negamax
>>> - quiescence search
>>> - hash tables
>>> - multiprobing
>>> - complex data structures
>>> - nullmove
>>> - possible forms of forward pruning
>>> - killer moves
>>> - move ordering
>>> - SEE (qsearch, move ordering)
>>> - futility
>>> - lazy evaluation
>>> - quick evaluation
>>> - psq tables to order moves
>>> - probcut
>>> - reductions
>>> - forward pruning (one of the many forms)
>>> - iterative deepening
>>> - internal iterative deepening (move ordering)
>>> - fractional ply depth
>>> - parallel search algorithms
>>> - check extensions
>>> - singular extensions
>>> - mating extensions
>>> - passed pawn extensions
>>> - recapture extensions
>>> - other extensions (so many the list is endless)
>>>
>>>This is just what i could type within 2 minutes. In short: all kinds of
>>>algorithms and methods get combined into something more and more
>>>complex, and it is all 'integrated' somehow, some of it
>>>domain-dependent, requiring a lot of chess-technical code in order to
>>>work well.
>>>
>>>Because 99.9% of all tuning algorithms never leave the paper, they
>>>usually can be described in a few lines of pseudo code. For that reason
>>>most are 99.9% doing the same thing in the same way, but under a new
>>>cool name. Just like nullmove R=2 and R=3 are exactly the same
>>>algorithm (nullmove), where only a small implementation detail differs.
>>>
>>>Yet the AI-learning world is so simple that most concepts are merely
>>>paper concepts which do not work in the real world.
>>>
>>>If they ever leave the paper, then they perform a silly experiment or
>>>conclude things in the wrong way. No objective science is getting done
>>>there.
>>>
>>>Those scientists simply hate programming.
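The R=2 versus R=3 remark above is literally one parameter: null-move pruning lets the side to move "pass", searches the result at reduced depth, and prunes if even a free move for the opponent cannot bring the score below beta. A hedged sketch follows; `search`, `make_null_move`, and `undo_null_move` are assumed hooks into a host engine, not real APIs.

```python
# Null-move pruning sketch. R (the depth reduction, typically 2 or 3)
# is the only difference between the "R=2" and "R=3" variants discussed
# above: the algorithm itself is identical. The engine hooks passed in
# (search, make_null_move, undo_null_move) are hypothetical.

R = 2  # or 3; same algorithm, different reduction

def null_move_prune(pos, depth, alpha, beta, search,
                    make_null_move, undo_null_move):
    """Return a beta-cutoff score if the null move refutes, else None."""
    if depth <= R:                   # too shallow for a reduced search
        return None
    make_null_move(pos)              # side to move passes
    # Zero-width search at reduced depth, from the opponent's view.
    score = -search(pos, depth - 1 - R, -beta, -beta + 1)
    undo_null_move(pos)
    if score >= beta:
        return score                 # fail-high: prune this node
    return None                      # no cutoff; search normally
```

The caller runs this before the regular move loop and skips the node entirely on a non-`None` return.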
>>>So the possibility of combining methods, and especially of combining
>>>them with domain-dependent knowledge, is near zero. In that light TD
>>>learning is seen as a grown-up algorithm.
>>>
>>>The good thing about it is that it does something without crashing. The
>>>bad thing about it is that it left the paper, so we could see how
>>>poorly it worked.
>>>
>>>It would be too harsh to say that the work was done for nothing. On the
>>>contrary: it simply proves how much of AI is paper work now, and how
>>>people can believe in the unknown.
>>>
>>>I have met at least 100 students who wanted to make a self-learning
>>>thing: a neural network, a genetic algorithm, etc.
>>>
>>>In fact, in all those years i have seen only 1 thing work, and that was
>>>a genetic algorithm finding a shorter path than my silly brute-force
>>>search did (we talk about hundreds of points). That genetic algorithm,
>>>however, had domain-dependent knowledge and was already helped with a
>>>short route to start with...
>>>
>>>Best regards,
>>>Vincent
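The one success conceded above, a genetic algorithm seeded with a known route, can be sketched in a few lines: start the population from a decent tour instead of random ones, then improve it by segment-reversal mutation and selection. Purely illustrative, not the actual program from the anecdote.

```python
# "Seeded" genetic algorithm sketch for route shortening: the initial
# population is copies of a known route (the domain knowledge), improved
# by reversal mutation and keep-the-best selection. Illustrative only.
import random

def route_length(route, dist):
    return sum(dist[a][b] for a, b in zip(route, route[1:]))

def improve_route(seed_route, dist, pop_size=20, generations=200, rng=None):
    rng = rng or random.Random(0)
    # Domain-knowledge seed: every individual starts as the known route.
    pop = [list(seed_route) for _ in range(pop_size)]
    for _ in range(generations):
        for r in pop:
            i, j = sorted(rng.sample(range(1, len(r) - 1), 2))
            child = r[:i] + r[i:j + 1][::-1] + r[j + 1:]  # reverse a segment
            if route_length(child, dist) < route_length(r, dist):
                r[:] = child                              # keep improvements
        pop.sort(key=lambda r: route_length(r, dist))
        half = pop_size // 2
        pop = pop[:half] + [list(r) for r in pop[:half]]  # select best half
    return min(pop, key=lambda r: route_length(r, dist))
```

With five cities on a line and a seed route that swaps two of them, the reversal mutation quickly recovers the straight-line tour.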