# Computer Chess Club Archives

## Messages

### Subject: Re: chess and neural networks

Author: Uri Blass

Date: 21:25:49 07/05/03

```
On July 05, 2003 at 22:58:51, Christophe Theron wrote:

>On July 05, 2003 at 13:55:12, Vincent Diepeveen wrote:
>
>>On July 05, 2003 at 00:25:24, Christophe Theron wrote:
>>
>>>On July 04, 2003 at 23:56:34, Vincent Diepeveen wrote:
>>>
>>>>On July 04, 2003 at 11:32:03, Christophe Theron wrote:
>>>>
>>>>>On July 03, 2003 at 15:44:44, Landon Rabern wrote:
>>>>>
>>>>>>On July 03, 2003 at 03:22:15, Christophe Theron wrote:
>>>>>>
>>>>>>>On July 02, 2003 at 13:13:43, Landon Rabern wrote:
>>>>>>>
>>>>>>>>On July 02, 2003 at 02:18:48, Dann Corbit wrote:
>>>>>>>>
>>>>>>>>>On July 02, 2003 at 02:03:20, Landon Rabern wrote:
>>>>>>>>>[snip]
>>>>>>>>>>I made an attempt to use a NN for determining extensions and reductions.  It was
>>>>>>>>>>evolved using a GA, kinda worked, but I ran out of time to work on it at the
>>>>>>>>>>end of school and don't have my computer anymore. The problem is that the NN is
>>>>>>>>>>SLOW, even using x/(1+|x|) for activation instead of tanh(x).
>>>>>>>>>
>>>>>>>>>Precompute a hyperbolic tangent table and store it in an array.  Speeds it up a
>>>>>>>>>lot.
>>>>>>>>
>>>>>>>>Well, x/(1+|x|) is as fast or faster than a large table lookup.  The slowdown
>>>>>>>>was from all the looping necessary for the feedforward.
>>>>>>>>
>>>>>>>>Landon
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>A stupid question maybe, but I'm very interested in this stuff:
>>>>>>>
>>>>>>>Do you really need a lot of accuracy for the "activation function"? Would it be
>>>>>>>possible to consider a 256-value output, for example?
>>>>>>>
>>>>>>>Would the lack of accuracy hurt?
>>>>>>>
>>>>>>>I'm not sure, but it seems to me that biological neurons do not need a lot of
>>>>>>>accuracy in their output, and even worse: they are noisy. So I wonder if low
>>>>>>>accuracy would be enough.
>>>>>>>
>>>>>>
>>>>>>There are neural net models that work with only binary output.  If the total
>>>>>>input value exceeds some threshold, then you get a 1, otherwise a 0.  The problem
>>>>>>is with training them by back prop.  But in this case I was using a Genetic Alg,
>>>>>>so no back prop at all - so no problem.  It might work, but I don't see the
>>>>>>benefit - were you thinking for speed?  The x/(1+|x|) is pretty fast to
>>>>>>calculate, but perhaps the binary (or other discrete) would be faster.
>>>>>>Something to try.
>>>>>>
>>>>>>Landon
>>>>>
>>>>>
>>>>>
>>>>>Yes, what I had in mind was optimization by using integer arithmetic only.
>>>>>
>>>>>If the output is always on 8 bits, the sigma(W*I) (weight*input) can be computed
>>>>>on 32 bits (each W*I will have at most 16 bits).
>>>>>
>>>>>Actually sigma(W*I) will have no more than 20 bits if each neuron has at most 16
>>>>>inputs. 32 bits allows for 65536 inputs per neuron.
>>>>>
>>>>>This -maybe- allows for a fast table lookup of the atan function that I see used
>>>>>often in ANNs. I think it can be a little faster than x/(1+|x|) computed using
>>>>>floating point arithmetic. Also, and this is even more important, the sigma(W*I)
>>>>>would use integer arithmetic instead of floating point.
>>>>>
>>>>>Maybe I should just do a Google search for this, I'm sure I'm not the first one
>>>>
>>>>I'm actually sure you are the first to find this optimization!
>>>>
>>>>The reason is that the average AI scientist never does many practical
>>>>experiments with ANNs. Practical researchers outside ANNs sometimes do
>>>>a few experiments like you and I do. And of that very tiny percentile of
>>>>researchers that sometimes do ANN experiments, the average solution they
>>>>come up with when they need to calculate faster is either to ask for some
>>>>system time on a supercomputer, or more likely to fill a sports hall of their
>>>>own university department with PCs, lay down a few network cables under
>>>>the floor, and run a few carefully selected benchmarks showing their Beowulf
>>>>cluster really is a great thing to have.
>>>>
>>>>When, ten years or so ago, some guy was looking for funding for either a
>>>>supercomputer or hardware to speed up his neural network software, which
>>>>was using quite a few neurons, I translated his QuickBasic program into
>>>>C and optimized its speed by writing things out and finding clever loops
>>>>within it that losslessly sped it up.
>>>>
>>>>In total I managed to speed up his software by around a factor of 1000 after
>>>>7 days of hard work (remember I started with 100KB of QuickBasic code and
>>>>ended up with about 20KB of C code; note that the QuickBasic used was the
>>>>compiler version, not an interpreter).
>>>>
>>>>I was very amazed then that he didn't like me doing that, because I had thought
>>>>he just wanted his software to run faster. When you grow up you slowly learn how
>>>>people work, and in the AI world this is pretty easy to predict.
>>>>
>>>>So with that in mind I am sure you are one of the first to publicly speak
>>>>out and say that you can speed things up a lot!
>>>>
>>>>Last Tuesday I was at a supercomputing conference, and of course I talked for
>>>>hours with many researchers and professors. I am still very proud that, after
>>>>hearing what they did on the computer, I did not tell a single one of them that
>>>>I would love to take a look at their code in order to give them a few free tips
>>>>to speed up their software by quite a few times. With some of them I knew for sure I could.
>>>>
>>>>Some still haven't found out the difference between latency and bandwidth,
>>>>which processors are out of order (the R14k processors), and that the new
>>>>Itanium 2 processors here (416 of them clocked at 1.3GHz, with 832GB of RAM)
>>>>are way faster for floating point and way slower in latency than the old R14ks.
>>>>
>>>>Possibly the slow latency is partly because of the interesting idea of running
>>>>Red Hat 7.2 with the unmodified Linux kernel 2.4.19 on it. Let's blame the
>>>>economic times that cause this Dutch habit of saving money :)
>>>>
>>>>A good example of the different research projects there which I could speed up
>>>>with just 5 minutes of work: several projects lose practically all of their
>>>>system time to an RNG, because for each number in the calculation matrix they
>>>>take a number from the RNG.
>>>>
>>>>They compile of course with option -O2.
>>>>
>>>>Their RNG is some slow thing that runs in 32 bits.
>>>>
>>>>However, for my latency test I made a small effort to speed up an already fast
>>>>RNG a little. Replacing their RNG with that one would speed up their field
>>>>calculations quite a lot.
>>>>
>>>>The matrix calculations they do are pretty fast, by the way, as an efficient
>>>>library is doing them for them.
>>>>
>>>>However, they could also speed the entire program up incredibly by using 64-bit
>>>>integer values instead of floating point.
>>>>
>>>>Remember, both are 64-bit processors: the R14k (8MB L2 cache) and the
>>>>Itanium 2 Madisons (3MB L2 cache).
>>>>
>>>>The research these guys do is still very good research.
>>>>
>>>>No, I won't mention a single name. They are cool guys.
>>>>
>>>>Best regards,
>>>>Vincent
>>>
>>>
>>>
>>>But I have seen some commercial ANN applications out there. Surely these have
>>>optimized integer arithmetic, because there must be an economic incentive to
>>>do so.
>>>
>>
>>Of course I didn't try all of them, just a few, but the few I tried were not
>>doing very well and were dead slow in fact.
>>
>>Just consider who buys such software, and you already know that for them the
>>features and the logic used are more important than the speed of that logic.
>>
>>I have to admit that I did my experiments with dead slow code too, because the
>>networks I used didn't have 10000 neurons like, for example, the one Dan Thies
>>used to tune a chess evaluation (using a normal chess program and letting the
>>ANN learn the evaluation, including material values etc.). For its time his ANN
>>was very expensive and very fast (it could do 10000 evaluations per second on
>>the 10000 neuron network).
>
>
>That's 1 cycle per neuron on a 100MHz computer, or 10 cycles per neuron on a
>1GHz computer.
>
>
>
>>It still has to beat Diep version 1.2 though (Diep version 1.2 could not mate
>>with KRK on the tested hardware, if I remember well).
>
>
>Maybe using it for the evaluation is not the most efficient use of a neural
>network in a chess program. It seems that the way human players manage to search
>the tree is vastly underestimated.
>
>
>
>    Christophe

I agree with you that search is underestimated in chess, but I also believe
that search and evaluation are connected, because a lot of search decisions are
based on the evaluation of positions that are not leaf positions, so you cannot
separate them and say a search improvement gives x Elo and an evaluation
improvement gives y Elo.

Uri

```
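
As a footnote to the thread above, here is a minimal C sketch (not from any of the posters) of the three activation options discussed: Landon's x/(1+|x|), a precomputed tanh table along the lines Dann Corbit suggests, and a binary threshold unit. The table size, input range, and function names are illustrative assumptions.

```c
#include <math.h>

/* Fast sigmoid-like activation: x / (1 + |x|).
 * Cheaper than tanh(x) because it avoids exp(). */
static double act_fast(double x)
{
    return x / (1.0 + fabs(x));
}

/* Precomputed tanh table.
 * Assumption: inputs are clamped to [-8, 8] and quantized into TANH_STEPS bins. */
#define TANH_STEPS 4096
#define TANH_RANGE 8.0
static double tanh_table[TANH_STEPS];

static void init_tanh_table(void)
{
    for (int i = 0; i < TANH_STEPS; i++) {
        double x = -TANH_RANGE + (2.0 * TANH_RANGE * i) / (TANH_STEPS - 1);
        tanh_table[i] = tanh(x);
    }
}

static double act_tanh_lookup(double x)
{
    if (x <= -TANH_RANGE) return -1.0;
    if (x >=  TANH_RANGE) return  1.0;
    int i = (int)((x + TANH_RANGE) * (TANH_STEPS - 1) / (2.0 * TANH_RANGE));
    return tanh_table[i];
}

/* Binary threshold unit, as in the binary-output models Landon mentions:
 * fires 1 if the summed input exceeds a threshold, else 0. */
static int act_threshold(double x, double threshold)
{
    return x > threshold ? 1 : 0;
}
```

Whether the table actually beats the rational approximation depends on cache behaviour, which matches Landon's observation that the large lookup was no faster for him.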
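
Likewise, a sketch of the integer-only evaluation Christophe outlines: 8-bit activations and 8-bit weights (so each product fits in 16 bits, and with at most 16 inputs the sum stays within 20 bits of a 32-bit accumulator), plus a 256-entry lookup table for the output nonlinearity, built here from atan as he mentions. The scaling constants and helper names are assumptions, not code from the thread.

```c
#include <stdint.h>
#include <math.h>

#define LUT_SIZE  256
#define ACC_SCALE 256   /* maps the accumulator range onto the table entries (assumption) */

static uint8_t act_lut[LUT_SIZE];   /* quantized activation, e.g. atan or x/(1+|x|) */

/* Build the table once from atan, rescaled to 0..255 (illustrative scaling). */
static void init_act_lut(void)
{
    for (int i = 0; i < LUT_SIZE; i++) {
        double x = (i - LUT_SIZE / 2) / 16.0;          /* assumed input scaling */
        double y = atan(x) / (3.14159265358979 / 2.0); /* in (-1, 1) */
        act_lut[i] = (uint8_t)((y + 1.0) * 127.5);     /* map to 0..255 */
    }
}

/* One neuron: 8-bit inputs, 8-bit signed weights, 32-bit accumulation,
 * then a table lookup for the activation. No floating point in the loop. */
static uint8_t neuron_forward(const uint8_t *in, const int8_t *w, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)in[i] * (int32_t)w[i];   /* each product fits in 16 bits */

    int idx = acc / ACC_SCALE + LUT_SIZE / 2;    /* center and scale (assumption) */
    if (idx < 0) idx = 0;
    if (idx >= LUT_SIZE) idx = LUT_SIZE - 1;
    return act_lut[idx];
}
```

The point, as Christophe notes, is that the inner loop never touches floating point; only the one-time table construction does.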
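
Finally, Vincent does not say which fast RNG he used for his latency test; purely as an illustration of the kind of replacement he is describing, here is a standard xorshift64 generator, which costs only a few shifts and XORs per number.

```c
#include <stdint.h>

/* xorshift64 (Marsaglia-style): a very cheap 64-bit RNG, shown only as an
 * example of a fast generator; not Vincent's code. */
static uint64_t rng_state = 0x9E3779B97F4A7C15ULL;  /* any nonzero seed */

static uint64_t xorshift64(void)
{
    uint64_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    rng_state = x;
    return x;
}
```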