Author: Aart J.C. Bik
Date: 14:06:17 01/13/05
>I guess your main focus with SSE/2/3 is more on float or double arithmetic
>rather than on integers, signed and unsigned chars, shorts, ints and even
>__int64 with rather unorthogonal instruction sets and a lot of special cases.
Hi Gerd,
Well, in the vectorizer’s defense: although the initial focus was indeed on
floating-point codes, a lot of effort has also been put into optimizing integer
codes. For instance, the following loop (not directly chess related)
unsigned char x[100], y[100];
…
int sum = 0;
for (i = 0; i < 100; i++) {
  int temp = x[i] - y[i];
  if (temp < 0) temp = -temp;
  sum += temp;
}
will automatically vectorize into code that exploits the “psadbw” idiom.
[C:/cmplr/temp] icl -nologo -Fa -QxP -Qunroll0 -c joho.c
joho.c
joho.c(14) : (col. 3) remark: LOOP WAS VECTORIZED.
[C:/cmplr/temp] cat joho.asm
....
L:      movdqa    xmm1, XMMWORD PTR _x[eax]
        psadbw    xmm1, XMMWORD PTR _y[eax]
        paddd     xmm0, xmm1
        add       eax, 16
        cmp       eax, 96
        jb        L
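For readers who have not met the idiom: psadbw takes sixteen unsigned bytes from each operand and produces two partial sums of absolute differences, one per 64-bit half. A hand-written approximation of the loop above with SSE2 intrinsics might look roughly like the sketch below. The function name sad_u8 and the scalar fallback are my own illustration, not compiler output, and the sketch uses unaligned loads where the generated code uses movdqa.

```c
#include <stddef.h>
#include <stdlib.h>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2 1
#endif

/* Hypothetical helper (illustration only): returns the sum of |x[i] - y[i]|. */
int sad_u8(const unsigned char *x, const unsigned char *y, size_t n)
{
    size_t i = 0;
    int sum = 0;
#ifdef HAVE_SSE2
    __m128i acc = _mm_setzero_si128();
    for (; i + 16 <= n; i += 16) {
        /* Unaligned loads; movdqa in the listing assumes 16-byte alignment. */
        __m128i vx = _mm_loadu_si128((const __m128i *)(x + i));
        __m128i vy = _mm_loadu_si128((const __m128i *)(y + i));
        /* psadbw yields one partial sum per 64-bit half; paddd accumulates. */
        acc = _mm_add_epi32(acc, _mm_sad_epu8(vx, vy));
    }
    /* Add the low and high halves of the accumulator. */
    sum = _mm_cvtsi128_si32(acc)
        + _mm_cvtsi128_si32(_mm_srli_si128(acc, 8));
#endif
    /* Scalar remainder (or the whole loop when SSE2 is not available). */
    for (; i < n; i++)
        sum += abs((int)x[i] - (int)y[i]);
    return sum;
}
```

With n = 100 the vector loop covers the first 96 bytes, which matches the "cmp eax, 96" bound in the listing, and the last four elements go through the scalar loop.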
But you are absolutely right that compiler-generated code still has a long way
to go before it can get even close to the "crafty" implementations you have
shown me so far.
Thanks for the feedback.
Aart
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.