Computer Chess Club Archives



Subject: Re: a faster neural-network activation function

Author: Jim Bell

Date: 11:51:53 06/06/01



On June 06, 2001 at 12:40:11, Landon Rabern wrote:

>On June 06, 2001 at 00:13:59, Jim Bell wrote:
>
>>On June 05, 2001 at 12:56:08, Landon Rabern wrote:
>>
>>>On June 05, 2001 at 08:21:16, Jim Bell wrote:
>>>
>>>>On June 04, 2001 at 19:00:55, Landon Rabern wrote:
>>>>
>>>>[SNIP]
>>>>>
>>>>>I have done some testing with a neural-network evaluation in my program for my
>>>>>independent study.  The biggest problem I ran into was the slowness of
>>>>>calculating all the sigmoids (I actually used tanh(NET)).  It drastically cuts
>>>>>down the nps, and it gets spanked by my handcrafted eval.  I got moderate results
>>>>>playing with set ply depths and no set time controls, but that isn't saying much.
>>>>>
>>>>>Regards,
>>>>>
>>>>>Landon W. Rabern
>>>>
>>>>In case you are still interested, you might want to consider what I assume is a
>>>>faster activation function: x/(1.0+|x|), where x is the total weighted input to
>>>>a node. I read about it in a paper titled "A Better Activation Function for
>>>>Artificial Neural Networks", by D.L. Elliott.  I found a link to the paper (in
>>>>PDF format) at:
>>>>
>>>>   "http://www.isr.umd.edu/TechReports/ISR/1993/TR_93-8/TR_93-8.phtml"
>>>>
>>>>I should warn you that I am certainly no expert when it comes to neural
>>>>networks, and I haven't seen this particular activation function used elsewhere,
>>>>but it shouldn't be too difficult to replace the tanh(x) and see what
>>>>happens. (Of course, you would also have to change the derivative
>>>>function!)
>>>>
>>>>Jim
>>>
>>>Interesting, I will have to try this.  The curve is not as smooth as the tanh,
>>>but unlike the standard 1/(1+e^-x) it does output on (-1, 1).  The derivative will
>>>be something like 1/(1+x)^2, but then you need to take into account the absolute
>>>value.  I don't see a way offhand to use the original activation function to
>>>produce the derivative quickly, but there must be a way.
>>>
>>>Regards,
>>>
>>>Landon W. Rabern
>>
>>As I recall, I tried a little experiment a couple of years ago in which I
>>simulated a simple 3-layer feedforward neural network, using the standard
>>y=1/(1+e^-x) squashing function, and then used the back-propagation algorithm to
>>teach the network some simple input/output relationships. Then, I tried
>>replacing the squashing function with y=x/(1+|x|), and instead of multiplying
>>something (I don't remember what) by y(1-y), I multiplied it by (1-|y|)^2,
>>because Elliott's paper has something like:
>>
>>y = 1/(1+e^-x), y' = (e^-x)/(1+e^-x)^2 = y(1-y)
>>y = x/(1+|x|), y' = 1/(1+|x|)^2 = (1-|y|)^2
>>
>>If memory serves me, the modified program also worked correctly, but I don't
>>remember how much faster (or perhaps slower??) the new program ran. I soon
>>thereafter deleted the code and I haven't done anything since with neural
>>networks.
>>
>>Jim
>
>OK, yes, y' = (1-|y|)^2 works, cool.  I will try it when I get a chance, but not
>right now; it is crunch time at work getting some products out, so I'm working 80+
>hour weeks.  They really like to work their interns :)
>
>Regards,
>
>Landon W. Rabern
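
For what it's worth, here is a rough, untested C sketch of what I had in mind.
The function and variable names are just my own for illustration; the point is
only that the derivative comes straight from the node's output, with no exp()
or tanh() call needed:

   #include <math.h>
   #include <stdio.h>

   /* Elliott activation: y = x/(1 + |x|), output in (-1, 1) */
   static double elliott(double x)
   {
       return x / (1.0 + fabs(x));
   }

   /* Its derivative, written in terms of the output y:
      y' = 1/(1 + |x|)^2 = (1 - |y|)^2 */
   static double elliott_deriv(double y)
   {
       double t = 1.0 - fabs(y);
       return t * t;
   }

   /* The standard logistic sigmoid and its derivative y' = y(1 - y),
      kept here only for comparison */
   static double sigmoid(double x)
   {
       return 1.0 / (1.0 + exp(-x));
   }

   static double sigmoid_deriv(double y)
   {
       return y * (1.0 - y);
   }

   int main(void)
   {
       double x;
       /* print a few sample values side by side */
       for (x = -4.0; x <= 4.0; x += 1.0)
       {
           double ye = elliott(x);
           double ys = sigmoid(x);
           printf("x = %5.1f  elliott = %7.4f (d %6.4f)  sigmoid = %6.4f (d %6.4f)\n",
                  x, ye, elliott_deriv(ye), ys, sigmoid_deriv(ys));
       }
       return 0;
   }

In the back-propagation code itself the change should amount to one line:
wherever the delta is scaled by y(1-y) for the logistic output, scale it by
(1-|y|)^2 instead. Take that with a grain of salt, though; it has been a while
since I looked at my old experiment.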

Let us know if you have any success with neural-nets/chess. I've tried a couple
of small things myself, but I didn't have the patience or the skill to come up with
anything promising. Maybe this summer I'll try again. BTW, I have a sister
named Betsy!

Good luck,

Jim


