Author: Landon Rabern
Date: 09:40:11 06/06/01
On June 06, 2001 at 00:13:59, Jim Bell wrote:

>On June 05, 2001 at 12:56:08, Landon Rabern wrote:
>
>>On June 05, 2001 at 08:21:16, Jim Bell wrote:
>>
>>>On June 04, 2001 at 19:00:55, Landon Rabern wrote:
>>>
>>>[SNIP]
>>>>
>>>>I have done some testing with a neural network evaluation in my program for my
>>>>independent study. The biggest problem I ran into was the slowness of
>>>>calculating all the sigmoids (I actually used tanh(NET)). It drastically cuts
>>>>down the nps and gets spanked by my handcrafted eval. I got moderate results
>>>>playing with set ply depths, no set time controls, but that isn't saying much.
>>>>
>>>>Regards,
>>>>
>>>>Landon W. Rabern
>>>
>>>In case you are still interested, you might want to consider what I assume is a
>>>faster activation function: x/(1.0+|x|), where x is the total weighted input to
>>>a node. I read about it in a paper titled "A Better Activation Function for
>>>Artificial Neural Networks", by D.L. Elliott. I found a link to the paper (in
>>>PDF format) at:
>>>
>>> "http://www.isr.umd.edu/TechReports/ISR/1993/TR_93-8/TR_93-8.phtml"
>>>
>>>I should warn you that I am certainly no expert when it comes to neural
>>>networks, and I haven't seen this particular activation function used elsewhere,
>>>but it shouldn't be too difficult to replace the tanh(x)
>>>and see what happens. (Of course, you would also have to change the
>>>derivative function as well!)
>>>
>>>Jim
>>
>>Interesting, I will have to try this. The curve is not as smooth as the tanh,
>>but unlike the standard 1/(1+e^-x) it does output on (-1, 1). The derivative will
>>be something like 1/(1+x)^2, but then it needs to take the absolute value into
>>account. I don't see a way offhand to use the original activation function to
>>produce the derivative quickly, but there must be a way.
>>
>>Regards,
>>
>>Landon W. Rabern
>
>As I recall, I tried a little experiment a couple of years ago in which I
>simulated a simple 3-layer feedforward neural network, using the standard
>y=1/(1+e^-x) squashing function, and then used the back-propagation algorithm to
>teach the network some simple input/output relationships. Then, I tried
>replacing the squashing function with y=x/(1+|x|), and instead of multiplying
>something (I don't remember what) by y(1-y), I multiplied it by (1-|y|)^2,
>because Elliott's paper has something like:
>
>y = 1/(1+e^-x), y' = (e^-x)/(1+e^-x)^2 = y(1-y)
>y = x/(1+|x|),  y' = 1/(1+|x|)^2 = (1-|y|)^2
>
>If memory serves me, the modified program also worked correctly, but I don't
>remember how much faster (or perhaps slower??) the new program ran. I soon
>thereafter deleted the code and I haven't done anything since with neural
>networks.
>
>Jim

OK, yes, y'=(1-|y|)^2 works, cool. I will try it when I get a chance, but there is
no time now; it is crunch time at work getting some products out, so I am working
80+ hour weeks. They really like to work their interns :)

Regards,

Landon W. Rabern
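For anyone following along, here is a minimal sketch in plain C (an illustration only, not code from either program; the function names elliott, elliott_deriv_from_output, and tanh_deriv_from_output are made up for this example) comparing tanh with Elliott's x/(1+|x|), where both derivatives are computed from the node's output alone:

  /* Sketch: Elliott squashing function x/(1+|x|) versus tanh,
   * with the output-only derivative forms discussed above:
   *   tanh:    y' = 1 - y^2
   *   Elliott: y' = 1/(1+|x|)^2 = (1 - |y|)^2
   * Compile with: cc elliott.c -lm
   */
  #include <math.h>
  #include <stdio.h>

  /* Elliott activation: maps any real input into (-1, 1). */
  static double elliott(double x)
  {
      return x / (1.0 + fabs(x));
  }

  /* Elliott derivative from the output y, no exp() or division needed. */
  static double elliott_deriv_from_output(double y)
  {
      double t = 1.0 - fabs(y);
      return t * t;
  }

  /* tanh derivative from the output y. */
  static double tanh_deriv_from_output(double y)
  {
      return 1.0 - y * y;
  }

  int main(void)
  {
      double x;

      printf("     x    tanh(x)   tanh'(x)  elliott(x) elliott'(x)\n");
      for (x = -3.0; x <= 3.0; x += 1.0) {
          double yt = tanh(x);
          double ye = elliott(x);
          printf("%6.2f %9.4f %10.4f %11.4f %11.4f\n",
                 x, yt, tanh_deriv_from_output(yt),
                 ye, elliott_deriv_from_output(ye));
      }
      return 0;
  }

The point of both shortcuts is that back-propagation only needs the node's output, and the Elliott version avoids the exp() call entirely, which is where any per-node speedup over tanh would come from.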