Computer Chess Club Archives




Subject: Re: chess and neural networks

Author: Vincent Diepeveen

Date: 10:55:12 07/05/03

On July 05, 2003 at 00:25:24, Christophe Theron wrote:

>On July 04, 2003 at 23:56:34, Vincent Diepeveen wrote:
>>On July 04, 2003 at 11:32:03, Christophe Theron wrote:
>>>On July 03, 2003 at 15:44:44, Landon Rabern wrote:
>>>>On July 03, 2003 at 03:22:15, Christophe Theron wrote:
>>>>>On July 02, 2003 at 13:13:43, Landon Rabern wrote:
>>>>>>On July 02, 2003 at 02:18:48, Dann Corbit wrote:
>>>>>>>On July 02, 2003 at 02:03:20, Landon Rabern wrote:
>>>>>>>>I made an attempt to use a NN for determining extensions and reductions.  It was
>>>>>>>>evolved using a GA, kinda worked, but I ran out of time to work on it at the
>>>>>>>>end of school and don't have my computer anymore. The problem is that the NN is
>>>>>>>>SLOW, even using x/(1+|x|) for activation instead of tanh(x).
>>>>>>>Precompute a hyperbolic tangent table and store it in an array.  Speeds it up a
>>>>>>Well, x/(1+|x|) is as fast or faster than a large table lookup.  The slowdown
>>>>>>was from all the looping necessary for the feedforward.
>>>>>A stupid question maybe, but I'm very interested by this stuff:
>>>>>Do you really need a lot of accuracy for the "activation function"? Would it be
>>>>>possible to consider a 256 values output for example?
>>>>>Would the lack of accuracy hurt?
>>>>>I'm not sure, but it seems to me that biological neurons do not need a lot of
>>>>>accuracy in their output, and even worse: they are noisy. So I wonder if low
>>>>>accuracy would be enough.
>>>>There are neural net models that work with only binary output.  If the total
>>>>input value exceeds some threshold then you get a 1 otherwise a 0.  The problem
>>>>is with training them by back prop.  But in this case I was using a Genetic Alg,
>>>>so no back prop at all - so no problem.  It might work, but I don't see the
>>>>benefit - were you thinking for speed?  The x/(1+|x|) is pretty fast to
>>>>calculate, but perhaps the binary (or other discrete) would be faster.
>>>>Something to try.
>>>Yes, what I had in mind was optimization by using integer arithmetic only.
>>>If the output is always on 8 bits, the sigma(W*I) (weight*input) can be computed
>>>on 32 bits (each W*I will have at most 16 bits).
>>>Actually sigma(W*I) will have no more than 20 bits if each neuron has at most 16
>>>inputs. 32 bits allows for 65536 inputs per neuron.
>>>This -maybe- allows for a fast table lookup of the atan function that I see used
>>>often in ANN. I think it can be a little faster than x/(1+|x|) computed using
>>>floating point arithmetic. Also, and this is even more important, the sigma(W*I)
>>>would use integer arithmetic instead of floating point.
>>>Maybe I should just do a Google search for this, I'm sure I'm not the first one
>>>to think about this optimization.
>>I'm actually sure you are the first to find this optimization!
>>The reason is that the average AI scientist never is doing many practical
>>experiments with ANNs. Basically practical researchers outside ANNs are doing
>>sometimes a few experiments like you and i. Further from that very tiny
>>percentile researchers that sometimes do ANN experiments the average solution
>>they come up with when they need to calculate it faster is either ask some
>>system time of a supercomputer, or more likely fill a sporthal of their own
>>university department with PC's, then they lay down a few network cables under
>>the floor and they run a few carefully selected benchmarks showing their beowulf
>>cluster really is a great thing to have.
>>When a year or 10 ago some dude was looking around for funding for either a
>>supercomputer or to buy hardware to speedup his neural networking software which
>>was using quite some neurons, then i have translated his quickbasic program into
>>C and optimized its speed by writing out stuff and finding clever loops within
>>it that lossless speeded it up.
>>In total i managed to speedup his software around a factor 1000 after 7 days of
>>hard work (remember i started with 100KB quickbasic code and ended up with about
>>20KB C code. note that the quickbasic used was the compilerversion not an
>>I was very amazed then that he didn't like me doing that, because i had thought
>>he just wanted his software to run faster. When you grow up you slowly learn how
>>people work and in the AI world this is pretty easy to predict.
>>So having that in mind i am sure that you are one of the first to publicly speak
>>out and say that you can speedup things a lot!
>>Last tuesday i was at a supercomputing conference and of course for hours i have
>>talked with many researchers and professors. I am still very proud that against
>>no one after talking what they did on the computer i told to that i would love
>>to take a look at their code in order to give them a few free tips to speedup
>>their software quite some times. With some of them i sure knew i could.
>>Some still haven't found out the difference between latency and bandwidth and
>>what is out of order (the R14k processors) and that the new itanium2 processors
>>here (416 of them clocked at 1.3GHz and 832GB RAM) which are way faster for
>>floating point and way slower for latency than the old R14ks.
>>Possible the slow latency is partly because of the interesting idea to run
>>redhat 7.2 with the unmodified linux kernel 2.4.19 at it. Let's blame the
>>economic times that causes this Dutch habit to save money :)
>>A good example of several different research projects there which i can speedup
>>with just 5 minutes of work is that several projects lose like all of their
>>system time to a RNG as for each number in the calculation matrix they take a
>>number from the RNG.
>>They compile of course with option -O2.
>>Their RNG is some slow thing that runs in 32 bits.
>>However for my latency test i did a very small effort to speedup an already fast
>>RNG a little. Replacing their RNG by this one would speedup their field
>>calculations quite a lot.
>>The matrix calculations they do then are pretty fast by the way as an efficient
>>library is doing them for them.
>>However they could also speed the entire program up incredibly by using 64 bits
>>integer values instead of floating point.
>>Remember both are 64 bit processors. Both the R14k (8MB L2 cache) and the
>>I2-Madisons which are 3MB L2 cache.
>>The research these guys do then still is very good research.
>>No i won't mention a single name of the guys. They are cool guys.
>>Best regards,
>But I have seen some commercial ANN applications out there. Surely these have
>optimized integer arithmetic, because there must be an economical incentive to
>do so.

Of course i didn't try all of them, just a few. But the few i tried were not
doing very well and were in fact dead slow.

Just consider who buys such software, and you already know that for them the
features and the logic used are more important than the speed of that logic.

I have to admit that i did my own experiments with dead slow code too, because
the networks i used didn't have 10000 neurons, unlike for example the one Dan
Thies used to tune a chess evaluation (using a normal chess program and letting
the ANN learn the evaluation, including material values etc). For its time his
ANN was very expensive and very fast (it could do 10000 evaluations a second on
the 10000 neuron network).

It still has to beat diep version 1.2 though (diep version 1.2 could not
deliver mate in KRK on the tested hardware, if i remember well).

>    Christophe
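
For reference, the integer-only idea Christophe describes in the quoted text
(8 bit inputs and weights, a 32 bit sum, and a table lookup for the activation)
could look roughly like the sketch below. It is only an illustration, not code
from anyone in this thread. The table uses x/(1+|x|) from Landon's posts rather
than atan or tanh, and the names (act_init, act_lookup, neuron_output) as well
as the fixed-point scaling (inputs scaled by 127, weights by 16, table step of
64) are assumptions picked for the example.

```c
#include <stdint.h>

enum {
    ACT_SHIFT = 6,        /* assumed: table index = |sum| >> 6            */
    ACT_SIZE  = 4096,     /* covers |sum| < 2^18, enough for 16 inputs    */
    ACT_ONE   = 127 * 16  /* assumed: the sum value that represents 1.0   */
};

/* Holds 127 * x / (1 + x) for x >= 0; negative x uses odd symmetry. */
static uint8_t act_table[ACT_SIZE];

/* Build the table once, using integer arithmetic only. */
void act_init(void)
{
    for (int32_t k = 0; k < ACT_SIZE; k++) {
        int32_t x = k << ACT_SHIFT;  /* sum value at the start of bucket k */
        act_table[k] = (uint8_t)((127 * x) / (ACT_ONE + x));
    }
}

/* Map a 32 bit weighted sum to an 8 bit output in [-127, 127]. */
int8_t act_lookup(int32_t sum)
{
    /* Magnitude computed in unsigned arithmetic to avoid overflow. */
    uint32_t mag = (sum < 0) ? (0u - (uint32_t)sum) : (uint32_t)sum;
    uint32_t idx = mag >> ACT_SHIFT;
    if (idx >= (uint32_t)ACT_SIZE)
        idx = ACT_SIZE - 1;          /* saturate for very large sums */
    int32_t y = act_table[idx];
    return (int8_t)(sum < 0 ? -y : y);
}

/* One neuron: every w*in product fits in 16 bits, so 16 inputs fit in
   20 bits and a 32 bit accumulator has plenty of headroom. */
int8_t neuron_output(const int8_t *w, const int8_t *in, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += (int32_t)w[i] * in[i];
    return act_lookup(sum);
}
```

Call act_init() once at startup, then neuron_output() for each neuron in the
feedforward loop. Whether the lookup really beats x/(1+|x|) in floating point
depends on the hardware and on the table staying in cache, so it would have to
be measured.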

Last modified: Thu, 07 Jul 11 08:48:38 -0700