Author: Jim Bell
Date: 11:51:53 06/06/01
On June 06, 2001 at 12:40:11, Landon Rabern wrote:

>On June 06, 2001 at 00:13:59, Jim Bell wrote:
>
>>On June 05, 2001 at 12:56:08, Landon Rabern wrote:
>>
>>>On June 05, 2001 at 08:21:16, Jim Bell wrote:
>>>
>>>>On June 04, 2001 at 19:00:55, Landon Rabern wrote:
>>>>
>>>>[SNIP]
>>>>>
>>>>>I have done some testing with a neural network evaluation in my program for my
>>>>>independent study. The biggest problem I ran into was the slowness of
>>>>>calculating all the sigmoids (I actually used tanh(NET)). It drastically cuts
>>>>>down the nps and gets spanked by my handcrafted eval. I got moderate results
>>>>>playing with set ply depths, no set time controls, but that isn't saying much.
>>>>>
>>>>>Regards,
>>>>>
>>>>>Landon W. Rabern
>>>>
>>>>In case you are still interested, you might want to consider what I assume is a
>>>>faster activation function: x/(1.0+|x|), where x is the total weighted input to
>>>>a node. I read about it in a paper titled "A Better Activation Function for
>>>>Artificial Neural Networks", by D.L. Elliott. I found a link to the paper (in
>>>>PDF format) at:
>>>>
>>>>  "http://www.isr.umd.edu/TechReports/ISR/1993/TR_93-8/TR_93-8.phtml"
>>>>
>>>>I should warn you that I am certainly no expert when it comes to neural
>>>>networks, and I haven't seen this particular activation function used elsewhere,
>>>>but it shouldn't be too difficult to replace the tanh(x)
>>>>and see what happens. (Of course, you would also have to change the
>>>>derivative function as well!)
>>>>
>>>>Jim
>>>
>>>Interesting, I will have to try this. The curve is not as smooth as the tanh,
>>>but unlike the standard 1/(1+e^-x) it does output on (-1,1). The derivative will
>>>be something like 1/(1+x)^2, but then I need to take into account the absolute
>>>value. I don't see a way offhand to use the original activation function to
>>>produce the derivative quickly, but there must be a way.
>>>
>>>Regards,
>>>
>>>Landon W. Rabern
>>
>>As I recall, I tried a little experiment a couple of years ago in which I
>>simulated a simple 3-layer feedforward neural network, using the standard
>>y=1/(1+e^-x) squashing function, and then used the back-propagation algorithm to
>>teach the network some simple input/output relationships. Then, I tried
>>replacing the squashing function with y=x/(1+|x|), and instead of multiplying
>>something (I don't remember what) by y(1-y), I multiplied it by (1-|y|)^2,
>>because Elliott's paper has something like:
>>
>>y = 1/(1+e^-x), y' = (e^-x)/(1+e^-x)^2 = y(1-y)
>>y = x/(1+|x|),  y' = 1/(1+|x|)^2 = (1-|y|)^2
>>
>>If memory serves me, the modified program also worked correctly, but I don't
>>remember how much faster (or perhaps slower?) the new program ran. I soon
>>thereafter deleted the code and I haven't done anything since with neural
>>networks.
>>
>>Jim
>
>OK, yes, y'=(1-|y|)^2 works, cool. I will try it when I get a chance, but no
>time now; it is crunch time at work getting some products out, so I'm working
>80+ hour weeks. They really like to work their interns :)
>
>Regards,
>
>Landon W. Rabern

Let us know if you have any success with neural nets and chess. I've tried a
couple of small things myself, but I didn't have the patience nor the skill to
come up with anything promising. Maybe this summer I'll try again.

BTW, I have a sister named Betsy!

Good luck,

Jim
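
For anyone who wants to try this without digging up the paper, here is a minimal
C sketch (not from the original thread; the function names are just illustrative)
of the Elliott activation x/(1+|x|), with its derivative computed from the stored
output as (1-|y|)^2, alongside tanh for comparison:

    /* Sketch only: Elliott activation vs. tanh, with derivatives written in
       terms of the activation output, as discussed in the thread above. */
    #include <math.h>
    #include <stdio.h>

    /* Elliott activation: maps any real input into (-1, 1) with no exp() call. */
    static double elliott(double x) {
        return x / (1.0 + fabs(x));
    }

    /* Derivative in terms of the output y = elliott(x):
       dy/dx = 1/(1+|x|)^2 = (1 - |y|)^2, so backprop only needs the stored y. */
    static double elliott_deriv_from_output(double y) {
        double t = 1.0 - fabs(y);
        return t * t;
    }

    /* tanh derivative in terms of its output, for comparison: 1 - y^2. */
    static double tanh_deriv_from_output(double y) {
        return 1.0 - y * y;
    }

    int main(void) {
        for (double x = -3.0; x <= 3.0; x += 1.5) {
            double ye = elliott(x);
            double yt = tanh(x);
            printf("x=%5.2f  elliott=%7.4f (d=%6.4f)  tanh=%7.4f (d=%6.4f)\n",
                   x, ye, elliott_deriv_from_output(ye),
                   yt, tanh_deriv_from_output(yt));
        }
        return 0;
    }

The point of writing the derivative in terms of the output is the same trick as
with the logistic y(1-y): during back-propagation you only need the activation you
already stored for the forward pass, and no transcendental function is evaluated
anywhere.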