Author: Dieter Buerssner
Date: 12:06:14 12/17/03
On December 17, 2003 at 14:51:38, Anthony Cozzie wrote:
>suppose we make a slight change:
>
>void vectoradd(double *a, double *b, double *c, int len)
>{
> for(int i = 0; i < len; i++)
> c[i+1] = b[i] + a[i];
>}
>
>now the compiler cannot unroll the loop unless it knows that there is no
>aliasing.
Hmmm, I think the compiler can still unroll the loop. For example like this
(untested, but you get the idea):
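void vectoradd(double *a, double *b, double *c, int len)
{
  int i;
  /* Two elements per iteration, in the original order */
  for (i = 0; i + 1 < len; i += 2)
  {
    c[i+1] = b[i] + a[i];
    c[i+2] = b[i+1] + a[i+1];
  }
  /* Handle the leftover element when len is odd */
  if (i < len)
    c[i+1] = b[i] + a[i];
}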
What the compiler cannot do (in your first and in your second source snippet),
for example: it cannot load a[i] and a[i+1] into two registers, do the same for
b, and then do two adds with the four registers, because the store to c[i+1]
might change a[i+1] or b[i+1] if the arrays overlap. It could do this if the
pointers were restrict-qualified.
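To make that concrete, here is a minimal sketch, assuming a C99 compiler, of the same loop with restrict-qualified pointers. The name vectoradd_restrict and the temporaries a0, a1, b0, b1 are made up for illustration. It only shows the reordering that restrict makes legal (all four loads before either store); whether a given compiler actually generates such code is up to the compiler.

```c
/* Hypothetical restrict-qualified variant of vectoradd (C99).
   The caller promises that a, b and c do not overlap, and that
   c has room for len + 1 elements. */
void vectoradd_restrict(const double *restrict a, const double *restrict b,
                        double *restrict c, int len)
{
    int i;
    for (i = 0; i + 1 < len; i += 2)
    {
        /* All four loads happen before any store. Without restrict,
           the store to c[i+1] might have changed a[i+1] or b[i+1]. */
        double a0 = a[i], a1 = a[i+1];
        double b0 = b[i], b1 = b[i+1];
        c[i+1] = b0 + a0;
        c[i+2] = b1 + a1;
    }
    if (i < len)    /* leftover element when len is odd */
        c[i+1] = b[i] + a[i];
}
```

Calling this with overlapping arrays is undefined behavior, so the promise of no aliasing now rests with the caller.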
A better example may be,
void copy_pv(pv_t *dest, pv_t *src, size_t n)
{
  size_t i;  /* same type as n, avoids a signed/unsigned comparison */
  for (i=0; i<n; i++)
    dest[i] = src[i];
}
Now assume pv_t is unsigned char, and n is large. An obvious optimization would
be to copy the elements word-wise (for example 4 chars at a time on typical
hardware). The compiler cannot do this optimization. You might call
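copy_pv(pv_array+1, pv_array, m);
and that would go wrong: the element-wise loop propagates pv_array[0] through
the whole range, while a word-wise copy would not. So this optimization must not
be done. However, when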
the pointers are restrict-qualified (and then you must never call copy_pv with
overlapping arguments, since that would be undefined behavior), the compiler can
do this optimization. For example, it will be safe when you only use calls like
copy_pv(pv1_array, pv2_array, m);
where pv1_array and pv2_array are different objects (no overlap).
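As a rough sketch of what the compiler would then be allowed to do, here is the word-wise copy written out by hand. It assumes pv_t is unsigned char as above and that a word is 4 bytes; the name copy_pv_restrict is made up for illustration, and the small memcpy calls are only there to move 4 bytes at once without alignment problems.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef unsigned char pv_t;   /* assumption taken from the text */

/* Hypothetical restrict-qualified copy; dest and src must not overlap. */
void copy_pv_restrict(pv_t *restrict dest, const pv_t *restrict src, size_t n)
{
    size_t i = 0;
    /* Copy four elements at a time through a 32-bit temporary. */
    for (; i + 4 <= n; i += 4)
    {
        uint32_t w;
        memcpy(&w, src + i, sizeof w);
        memcpy(dest + i, &w, sizeof w);
    }
    /* Copy the remaining 0..3 elements one by one. */
    for (; i < n; i++)
        dest[i] = src[i];
}
```

For non-overlapping arrays this gives the same result as the element-wise loop. With overlapping arguments the two can differ, which is exactly why the compiler needs restrict before it may do this on its own.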
Regards,
Dieter