Computer Chess Club Archives




Subject: Re: chess and neural networks

Author: Christophe Theron

Date: 19:58:51 07/05/03


On July 05, 2003 at 13:55:12, Vincent Diepeveen wrote:

>On July 05, 2003 at 00:25:24, Christophe Theron wrote:
>>On July 04, 2003 at 23:56:34, Vincent Diepeveen wrote:
>>>On July 04, 2003 at 11:32:03, Christophe Theron wrote:
>>>>On July 03, 2003 at 15:44:44, Landon Rabern wrote:
>>>>>On July 03, 2003 at 03:22:15, Christophe Theron wrote:
>>>>>>On July 02, 2003 at 13:13:43, Landon Rabern wrote:
>>>>>>>On July 02, 2003 at 02:18:48, Dann Corbit wrote:
>>>>>>>>On July 02, 2003 at 02:03:20, Landon Rabern wrote:
>>>>>>>>>I made an attempt to use a NN for determining extensions and reductions.  It was
>>>>>>>>>evolved using a GA, kinda worked, but I ran out of time to work on it at the
>>>>>>>>>end of school and don't have my computer anymore. The problem is that the NN is
>>>>>>>>>SLOW, even using x/(1+|x|) for activation instead of tanh(x).
>>>>>>>>Precompute a hyperbolic tangent table and store it in an array.  Speeds it up a
>>>>>>>Well, x/(1+|x|) is as fast or faster than a large table lookup.  The slowdown
>>>>>>>was from all the looping necessary for the feedforward.
>>>>>>A stupid question maybe, but I'm very interested by this stuff:
>>>>>>Do you really need a lot of accuracy for the "activation function"? Would it be
>>>>>>possible to consider a 256 values output for example?
>>>>>>Would the lack of accuracy hurt?
>>>>>>I'm not sure, but it seems to me that biological neurons do not need a lot of
>>>>>>accuracy in their output, and even worse: they are noisy. So I wonder if low
>>>>>>accuracy would be enough.
>>>>>There are neural net models that work with only binary output.  If the total
>>>>>input value exceeds some threshold then you get a 1, otherwise a 0.  The problem
>>>>>is with training them by back prop.  But in this case I was using a Genetic Alg,
>>>>>so no back prop at all - so no problem.  It might work, but I don't see the
>>>>>benefit - were you thinking of speed?  The x/(1+|x|) is pretty fast to
>>>>>calculate, but perhaps the binary (or other discrete) would be faster.
>>>>>Something to try.
>>>>Yes, what I had in mind was optimization by using integer arithmetic only.
>>>>If the output is always on 8 bits, the sigma(W*I) (weight*input) can be computed
>>>>on 32 bits (each W*I will have at most 16 bits).
>>>>Actually sigma(W*I) will have no more than 20 bits if each neuron has at most 16
>>>>inputs. 32 bits allows for 65536 inputs per neuron.
>>>>This -maybe- allows for a fast table lookup of the atan function that I see used
>>>>often in ANN. I think it can be a little faster than x/(1+|x|) computed using
>>>>floating point arithmetic. Also, and this is even more important, the sigma(W*I)
>>>>would use integer arithmetic instead of floating point.
>>>>Maybe I should just do a Google search for this, I'm sure I'm not the first one
>>>>to think about this optimization.
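
The integer-only idea quoted above can be sketched roughly as follows. This is only a minimal illustration, not anyone's actual implementation: the names `init_act_table` and `neuron_output`, the 512-entry table, the divisor 256 and the input scale 1/32 are all assumptions chosen for the example. Weights and inputs are signed 8-bit values, the sum is accumulated in 32 bits, and the activation x/(1+|x|) is read from a precomputed table.

```c
#include <stdint.h>

#define ACT_TABLE_SIZE 512   /* assumed table size (hypothetical) */
#define ACT_DIVISOR    256   /* assumed scaling from sum to table index */

static int8_t act_table[ACT_TABLE_SIZE];

/* Fill the table once at startup with x/(1+|x|), scaled to [-127, 127].
 * Floating point is used only here, never in the per-neuron loop. */
void init_act_table(void)
{
    for (int i = 0; i < ACT_TABLE_SIZE; i++) {
        double x = (i - ACT_TABLE_SIZE / 2) / 32.0;   /* assumed input scale */
        double ax = (x < 0.0) ? -x : x;
        act_table[i] = (int8_t)(x / (1.0 + ax) * 127.0);
    }
}

/* One neuron: integer multiply-accumulate followed by a table lookup. */
int8_t neuron_output(const int8_t *weights, const int8_t *inputs, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += (int32_t)weights[i] * (int32_t)inputs[i];  /* product fits in 16 bits */

    /* Map the sum to a table index and clamp it to the table range. */
    int32_t idx = sum / ACT_DIVISOR + ACT_TABLE_SIZE / 2;
    if (idx < 0)
        idx = 0;
    if (idx >= ACT_TABLE_SIZE)
        idx = ACT_TABLE_SIZE - 1;
    return act_table[idx];
}
```

Call `init_act_table()` once, then `neuron_output()` for each neuron. Whether the lost precision hurts the network is exactly the open question in the thread; the sketch only shows that the inner loop needs no floating point.
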
>>>I'm actually sure you are the first to find this optimization!
>>>The reason is that the average AI scientist never does many practical
>>>experiments with ANNs. Basically, practical researchers outside ANNs
>>>sometimes do a few experiments, like you and i. Beyond that very tiny
>>>percentage of researchers who sometimes do ANN experiments, the average solution
>>>they come up with when they need to calculate faster is either to ask for some
>>>system time on a supercomputer, or more likely to fill a sports hall of their own
>>>university department with PCs; then they lay a few network cables under
>>>the floor and run a few carefully selected benchmarks showing their Beowulf
>>>cluster really is a great thing to have.
>>>About 10 years ago some dude was looking around for funding for either a
>>>supercomputer or hardware to speed up his neural networking software, which
>>>was using quite a few neurons. I translated his QuickBasic program into
>>>C and optimized its speed by writing things out and finding clever loops within
>>>it that sped it up losslessly.
>>>In total i managed to speed up his software by around a factor 1000 after 7 days of
>>>hard work (remember i started with 100KB of QuickBasic code and ended up with about
>>>20KB of C code. Note that the QuickBasic used was the compiler version, not an
>>>I was very amazed then that he didn't like me doing that, because i had thought
>>>he just wanted his software to run faster. When you grow up you slowly learn how
>>>people work and in the AI world this is pretty easy to predict.
>>>So with that in mind i am sure that you are one of the first to publicly speak
>>>out and say that you can speed things up a lot!
>>>Last Tuesday i was at a supercomputing conference and of course i talked for
>>>hours with many researchers and professors. I am still very proud that, after
>>>talking about what they did on the computer, i told none of them that i would love
>>>to take a look at their code in order to give them a few free tips to speed up
>>>their software several times over. With some of them i knew for sure i could.
>>>Some still haven't found out the difference between latency and bandwidth,
>>>what out-of-order means (the R14k processors), or that the new Itanium2 processors
>>>here (416 of them clocked at 1.3GHz, with 832GB RAM) are way faster for
>>>floating point and way slower for latency than the old R14ks.
>>>Possibly the slow latency is partly because of the interesting idea to run
>>>Red Hat 7.2 with the unmodified Linux kernel 2.4.19 on it. Let's blame the
>>>economic times that cause this Dutch habit of saving money :)
>>>A good example of something i could speed up with just 5 minutes of work:
>>>several different research projects there lose nearly all of their
>>>system time to an RNG, because for each number in the calculation matrix they take a
>>>number from the RNG.
>>>They compile of course with option -O2.
>>>Their RNG is some slow thing that runs in 32 bits.
>>>However, for my latency test i made a very small effort to speed up an already fast
>>>RNG a little. Replacing their RNG with this one would speed up their field
>>>calculations quite a lot.
>>>The matrix calculations they do then are pretty fast by the way as an efficient
>>>library is doing them for them.
>>>However they could also speed the entire program up incredibly by using 64-bit
>>>integer values instead of floating point.
>>>Remember both are 64-bit processors: the R14k (8MB L2 cache) and the
>>>I2-Madisons, which have 3MB L2 cache.
>>>The research these guys do then still is very good research.
>>>No i won't mention a single name of the guys. They are cool guys.
>>>Best regards,
>>But I have seen some commercial ANN applications out there. Surely these have
>>optimized integer arithmetic, because there must be an economical incentive to
>>do so.
>Of course i didn't try all of them. Just a few. But the few i tried were not
>doing very well and were in fact dead slow.
>Just consider who buys such software, then you already know that for them the
>features and logic used are more important than the speed of the logic.
>I have to admit that i did my experiments with dead slow code too, because
>the networks i used didn't have 10000 neurons like, for example, the one Dan Thies used to
>tune a chess evaluation (using a normal chess program and letting the ANN learn
>the evaluation including material values etc). For its time his ANN was very
>expensive and very fast (it could do 10000 evaluations a second on the 10000
>neuron network).

That's 1 cycle per neuron on a 100MHz computer, or 10 cycles per neuron on a
1GHz computer.
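
As a quick check of that arithmetic, here is a minimal sketch in C. The function name `cycles_per_neuron` is just an illustrative choice, and it assumes every neuron is evaluated once per evaluation call.

```c
/* Cycle budget per neuron: clock rate divided by the number of neuron
 * evaluations per second (evaluations per second * neurons per evaluation). */
double cycles_per_neuron(double clock_hz, double evals_per_sec, double neurons)
{
    return clock_hz / (evals_per_sec * neurons);
}
```

With 10000 evaluations per second on a 10000 neuron network, that is 10^8 neuron evaluations per second, so `cycles_per_neuron(100e6, 10000, 10000)` gives 1 and `cycles_per_neuron(1e9, 10000, 10000)` gives 10.
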

>It still has to beat diep version 1.2 though (diep version 1.2 could not, on the
>tested hardware, deliver mate in KRK if i remember well).

Maybe using it for the evaluation is not the most efficient use of a neural
network in a chess program. It seems that the way human players manage to search
the tree is vastly underestimated.



Last modified: Thu, 07 Jul 11 08:48:38 -0700
