Computer Chess Club Archives


Subject: Re: Optimizing C code for speed

Author: Dieter Buerssner

Date: 13:52:12 01/02/03


On January 02, 2003 at 16:34:02, Matt Taylor wrote:

>Consider a snippet of code which does integer vector addition (similar to
>example pointed out earlier):
>
>int add_them(int *dest, int *src1, int *src2, int len)
>{
>    int i;
>
>    for(i = 0; i < len; i++)
>        *dest++ = *src1++ + *src2++;
>}

I think I agree with everything you said. Some more points: the coder of the
above snippet obviously wanted to be rather clever, but still missed a few
things. First of all (not really related to the topic), int i should probably be
size_t i (after including stddef.h, stdlib.h, or another header that defines
size_t).

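Then, on many CPU/compiler combinations, counting the loop down to zero would
probably be faster in this case (the pointers still walk forward; only the
counter runs backwards, so the loop test becomes a compare against zero):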

void add_them(int *dest, int *src1, int *src2, size_t len) /* returns nothing, so void */
{
    size_t i;

    for(i = len; i != 0; i--)
        *dest++ = *src1++ + *src2++;
}

Also, most probably this function will never be called with len 0. In that case
a do-while loop would be appropriate (note that it must not be called with
len 0, because the body always runs at least once):

void add_them(int *dest, int *src1, int *src2, size_t len) /* returns nothing, so void */
{
    /* Perhaps, for some compiler, using a local variable for len may help */
    do
    {
        *dest++ = *src1++ + *src2++;
    } while (--len != 0);
}
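
The len 0 assumption can be made explicit. The following is only a sketch; the
names add_them_nonempty and add_them_checked are made up for illustration, and
the const qualifiers are my addition. With len 0 the do-while body would still
run once, and --len would wrap around to a huge size_t value, so the precondition
is asserted and a guarded wrapper is offered for callers that cannot rule out
len 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name. Precondition: len > 0. */
void add_them_nonempty(int *dest, const int *src1, const int *src2, size_t len)
{
    assert(len != 0);   /* compiled out when NDEBUG is defined */
    do
    {
        *dest++ = *src1++ + *src2++;
    } while (--len != 0);
}

/* Hypothetical wrapper for callers that may pass len == 0. */
void add_them_checked(int *dest, const int *src1, const int *src2, size_t len)
{
    if (len != 0)
        add_them_nonempty(dest, src1, src2, len);
}
```

Whether the extra test in the wrapper costs anything measurable depends on the
compiler and the call site.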



>This could be more efficient on x86 (and much more readable IMO) written as
>follows:

Yes. Also, in this case, loop unrolling by the compiler will probably help. It
is possible that the compiler can see this more easily with the most obvious
implementation (which I snipped).
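
To illustrate what unrolling means here, this is a hand-unrolled sketch. It is
not the snipped code and not what any particular compiler emits; the name
add_them_unrolled4 and the unroll factor of four are arbitrary choices for the
example. An optimizing compiler may produce something similar (or vectorized
code) from the plain indexed loop on its own.

```c
#include <stddef.h>

/* Hypothetical name; unroll factor 4 chosen only for illustration. */
void add_them_unrolled4(int *dest, const int *src1, const int *src2, size_t len)
{
    size_t i = 0;

    /* Main loop: four elements per iteration while at least four remain. */
    for (; len - i >= 4; i += 4)
    {
        dest[i]     = src1[i]     + src2[i];
        dest[i + 1] = src1[i + 1] + src2[i + 1];
        dest[i + 2] = src1[i + 2] + src2[i + 2];
        dest[i + 3] = src1[i + 3] + src2[i + 3];
    }

    /* Tail loop: the remaining zero to three elements. */
    for (; i < len; i++)
        dest[i] = src1[i] + src2[i];
}
```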

Since Duff's device was mentioned: in my chess engine, I cannot think of a
good place where it would be useful. Sure, for some initialization stuff it
would produce faster code. But in the real inner loops ... ?
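
For readers who have not seen it, here is a sketch of Duff's device applied to
the vector addition from this thread. The name add_them_duff is made up, and
this is just the textbook switch-into-loop pattern with an unroll factor of
eight, not code from my engine. The switch jumps into the middle of the unrolled
body to handle the len % 8 leftover elements first; every later pass does a full
eight.

```c
#include <stddef.h>

/* Hypothetical name; classic Duff's device, unrolled by eight. */
void add_them_duff(int *dest, const int *src1, const int *src2, size_t len)
{
    size_t n;

    if (len == 0)
        return;                     /* the device itself cannot handle len == 0 */

    n = len / 8 + (len % 8 != 0);   /* number of passes through the do-while */
    switch (len % 8)
    {
    case 0: do { *dest++ = *src1++ + *src2++;   /* cases fall through on purpose */
    case 7:      *dest++ = *src1++ + *src2++;
    case 6:      *dest++ = *src1++ + *src2++;
    case 5:      *dest++ = *src1++ + *src2++;
    case 4:      *dest++ = *src1++ + *src2++;
    case 3:      *dest++ = *src1++ + *src2++;
    case 2:      *dest++ = *src1++ + *src2++;
    case 1:      *dest++ = *src1++ + *src2++;
            } while (--n > 0);
    }
}
```

Some compilers warn about the implicit fall-through here, and the unusual control
flow may get in the way of the compiler's own unrolling, so it should be measured
rather than assumed to be faster.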

Perhaps that is one reason why the high optimization levels (-O3, etc.) do not
help for me. Actually, they typically make my code (not only the chess engine)
slower. Also, one may ask: why don't compilers default to such "good"
optimization? I think it is because the compiler writers know that it is not
always that good.

Regards,
Dieter



Last modified: Thu, 15 Apr 21 08:11:13 -0700
