Author: Dann Corbit
Date: 16:28:24 10/26/98
On October 26, 1998 at 18:40:59, Fernando Villegas wrote:

[snip]
>If I did not understand bad, this system means Rebel will not play anymore a
>line not because this line is flawed, BUT because he was not capable of getting
>good results with it. If that is the thing, then Rebel is not learning to play
>better, but just avoiding paths where he mishandles the game. This is more a
>neurotic behaviour than a learning one. So, in the long run, what we'll have
>will be not a more knowledgeable program, but a more restricted one, a narrow
>minded program stuck just with the lines he plays well with his actual
>programming.

If there were flaws that resulted in narrow lines, and these lines were not good ones, then it would start to get beaten on these lines by good players and/or good programs. Since these lines would start to lose, they would now be downgraded. Hence, this technique heals itself.

One thing that we should be careful of is unnecessary 'avoid move' marking. This might prevent learning valuable strategies (IMO). Similarly, marking something as 'best move' should really be validated by a GM. Otherwise, it should only be suggested among alternatives.

>I think -although I know there is a great abysm between words and
>implementation- that real learning should mean some kind of changes within the
>source code.

That is a program that does not learn, unless you are talking about self-modifying code. Such learning is not needed. Information, after all, is *data*, not code.

For NN-type learning, I think the flaw is that computers are not aware of temporal sequence. In other words, the net learns that it is good to develop the queen. But instead of waiting for a good formation, it thrusts the queen out right away. It did not know _when_ such a thing was advisable. Perhaps a NN should be combined with a FSM in some way.

But modifying the code to make the program smarter is how most people do it now. It is called an upgrade, usually.
Trying to do this 'on the fly' has some serious disadvantages. First of all, self-modifying code is very hard to debug and maintain. Secondly, it is hard to make it efficient. It is sort of like a virtual function where we don't even know for sure what the function is going to do! It sounds like we would have to do lots of inquiry (like RTTI, only worse!)

I plan to use the SQL database in the project that I am working on to make Crafty learn. It will have not only the centipawn eval, but also win/loss/draw data and other statistics. If it cannot find the data, or the data is not as strong as it wants, it will generate it as normal. But after recalculation, it will store the new information to disk. This is a program that really does learn as it plays. And don't worry about terabytes of disk. It can only write out data as fast as it can play.

But I might create some kind of central database repository that thousands of Crafties all over the world could connect to. That way, all of them benefit from each other's learning.
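
To make the "look it up, otherwise generate it and store it" idea concrete, here is a rough sketch. It is only an illustration under stated assumptions: it uses SQLite through Python's standard sqlite3 module rather than whatever SQL database the project actually uses, and the table layout, the function names (open_store, evaluate, record_result, expected_score), the use of search depth as the measure of how "strong" stored data is, and the search callback are all hypothetical, not taken from Crafty.

```python
import sqlite3

# Hypothetical schema: one row per position, keyed by a position string.
SCHEMA = """
CREATE TABLE IF NOT EXISTS positions (
    epd     TEXT PRIMARY KEY,
    eval_cp INTEGER NOT NULL DEFAULT 0,  -- centipawn evaluation
    depth   INTEGER NOT NULL DEFAULT 0,  -- depth that produced eval_cp (0 = none yet)
    wins    INTEGER NOT NULL DEFAULT 0,
    losses  INTEGER NOT NULL DEFAULT 0,
    draws   INTEGER NOT NULL DEFAULT 0
)
"""

# Fixed mapping so no caller-supplied text is ever spliced into SQL.
RESULT_COLUMNS = {"win": "wins", "loss": "losses", "draw": "draws"}


def open_store(path=":memory:"):
    """Open (or create) the learning database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def evaluate(conn, epd, min_depth, search):
    """Return the stored eval if it is deep enough; otherwise search and store.

    `search(epd, depth)` stands in for the engine's normal search (assumed).
    """
    row = conn.execute(
        "SELECT eval_cp, depth FROM positions WHERE epd = ?", (epd,)
    ).fetchone()
    if row is not None and row[1] >= min_depth:
        return row[0]  # stored data is strong enough, no recalculation
    eval_cp = search(epd, min_depth)  # generate it as normal
    conn.execute("INSERT OR IGNORE INTO positions (epd) VALUES (?)", (epd,))
    conn.execute(
        "UPDATE positions SET eval_cp = ?, depth = ? WHERE epd = ?",
        (eval_cp, min_depth, epd),
    )
    conn.commit()  # the new information goes to disk
    return eval_cp


def record_result(conn, epd, result):
    """Add one game result ('win', 'loss' or 'draw') for a position."""
    column = RESULT_COLUMNS[result]
    conn.execute("INSERT OR IGNORE INTO positions (epd) VALUES (?)", (epd,))
    conn.execute(
        f"UPDATE positions SET {column} = {column} + 1 WHERE epd = ?", (epd,)
    )
    conn.commit()


def expected_score(conn, epd):
    """Fraction of points scored from this position, or None with no games."""
    row = conn.execute(
        "SELECT wins, losses, draws FROM positions WHERE epd = ?", (epd,)
    ).fetchone()
    if row is None or sum(row) == 0:
        return None
    wins, losses, draws = row
    return (wins + 0.5 * draws) / (wins + losses + draws)
```

In this sketch everything the program learns lives in the table as data; the code itself never changes. A line that keeps losing simply sees its expected_score drop, which is one way the downgrading described earlier could be driven.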
Last modified: Thu, 15 Apr 21 08:11:13 -0700