Author: Dieter Buerssner
Date: 12:06:14 12/17/03
On December 17, 2003 at 14:51:38, Anthony Cozzie wrote:

>suppose we make a slight change:
>
>void vectoradd(double *a, double *b, double *c, int len)
>{
>  for(int i = 0; i < len; i++)
>    c[i+1] = b[i] + a[i];
>}
>
>now the compiler cannot unroll the loop unless it knows that there is no
>aliasing.

Hmmm, I think the compiler can still unroll the loop. For example like this (untested, but you get the idea):

void vectoradd(double *a, double *b, double *c, int len)
{
  int i;
  /* Two iterations per pass, loads and stores in the original order. */
  for (i = 0; i + 1 < len; i += 2)
  {
    c[i+1] = b[i] + a[i];
    c[i+2] = b[i+1] + a[i+1];
  }
  /* Handle the last element when len is odd. */
  if (i < len)
    c[i+1] = b[i] + a[i];
}

What the compiler cannot do (in your first and in your second source snippet), for example: it cannot load a[i] and a[i+1] into two registers, do the same for b, and then do two adds with the four registers, because the first store through c might change a[i+1] or b[i+1]. It could do this if the pointers were restrict-qualified.

A better example may be:

void copy_pv(pv_t *dest, pv_t *src, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    dest[i] = src[i];
}

Now assume pv_t is unsigned char, and n is large. An obvious optimization would be to copy the elements word-wise (for example 4 chars at a time on typical hardware). The compiler cannot do this optimization. You might call

copy_pv(pv_array+1, pv_array, m);

and that would go wrong: the loop as written reads elements it has just stored, while a word-wise copy would read them before they are overwritten. So this optimization must not be done. However, when the pointers are restrict-qualified (and then you can never call copy_pv like above), the compiler can do this optimization. For example it will be safe when you only use calls like

copy_pv(pv1_array, pv2_array, m);

where pv1_array and pv2_array are different objects (no overlap).

Regards,
Dieter
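To make the overlap problem concrete, here is a minimal C99 sketch, not a definitive implementation. It assumes pv_t is unsigned char, as in the post. The names copy_pv_wordwise and copy_pv_restrict are hypothetical, made up for illustration; the first one imitates by hand what a 4-bytes-at-a-time optimization would do, and the assert in the second is only a debug check of the no-overlap promise, not something restrict gives you by itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef unsigned char pv_t;  /* assumption: pv_t is unsigned char */

/* Element-by-element copy. Its result is well defined even when
   dest and src overlap, so a compiler has to preserve that result. */
void copy_pv(pv_t *dest, const pv_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = src[i];
}

/* Hypothetical hand-written "word-wise" variant: moves 4 elements per
   step. Equivalent to copy_pv only if dest and src do not overlap. */
void copy_pv_wordwise(pv_t *dest, const pv_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        unsigned char word[4];
        memcpy(word, src + i, 4);   /* load 4 elements at once */
        memcpy(dest + i, word, 4);  /* store 4 elements at once */
    }
    for (; i < n; i++)              /* leftover elements */
        dest[i] = src[i];
}

/* restrict is a promise by the caller that the two ranges do not
   overlap; with it, a compiler may use a word-wise copy here. */
void copy_pv_restrict(pv_t *restrict dest, const pv_t *restrict src, size_t n)
{
    /* Debug-only check of the promise (an addition for this sketch). */
    assert((uintptr_t)dest + n <= (uintptr_t)src ||
           (uintptr_t)src + n <= (uintptr_t)dest);
    for (size_t i = 0; i < n; i++)
        dest[i] = src[i];
}
```

With an array holding 1,2,...,9, the call copy_pv(arr+1, arr, 8) leaves nine 1s in the array, while copy_pv_wordwise(arr+1, arr, 8) leaves 1,1,2,3,4,4,6,7,8. For two separate arrays all three functions give the same result, which is exactly the case restrict lets the compiler assume.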
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.