Author: Aart J.C. Bik
Date: 11:17:01 01/13/05
Hi Gerd,

Thanks for your insights! Well, vectorization in the Intel compiler is my specialty :-). If you want to quickly learn more about all switches and pragmas related to vectorization, please refer to the online IDS article at http://www.intel.com/cd/ids/developer/asmo-na/eng/65774.htm. If you are interested in much more detail, please also allow me to promote my book on this subject:

The Software Vectorization Handbook. Intel Press, June 2004.
http://www.intel.com/intelpress/sum_vmmx.htm

Having said that, it would be nice if I could show straightforward vectorization of your code. Alas, things are not that simple (and I hope to get new insights in this forum). Let’s start with a slight simplification (pre-compute the shift factors and use a 32-bit bitboard):

    unsigned int bits32[64]; /* precomputed shifts */

    int dotProduct32(unsigned int bb, unsigned char weight[]) {
      int i;
      unsigned int sum = 0;
    #pragma vector aligned /* <- used assuming weight is 16-byte aligned */
      for (i=0; i < 32; i++) {
        if (bb & bits32[i]) sum += weight[i];
      }
      return sum;
    }

This will vectorize using the Intel compiler (also note that your “hint” on masking the reduction is not required):

    [C:/temp] icl -Fa -Qunroll0 -nologo -QxP -c dot32.c
    dot32.c
    dot32.c(10) : (col. 2) remark: LOOP WAS VECTORIZED.

In its “rerolled” form (for simplicity I used -Qunroll0), the generated code looks like:

    <setup>
    L:
      movdqa    xmm4, XMMWORD PTR _bits32[0+eax*4]
      pand      xmm4, xmm0
      pcmpeqd   xmm4, xmm1
      movd      xmm3, DWORD PTR [eax+edx]
      punpcklbw xmm3, xmm1
      punpcklwd xmm3, xmm1
      add       eax, 4
      cmp       eax, 32
      pandn     xmm4, xmm3
      paddd     xmm2, xmm4
      jb        L
    <compute partial sums>

Your 64-bit version gives my vectorizer more headaches, however. Let me ponder this some more to see what can be improved in the Intel compiler (looks like I am back at my job rather than focusing on the chess engine :-).

Sincerely,
Aart Bik
http://www.aartbik.com/
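Editorial sketch: for readers who want to follow the listing step by step, the vector loop in the post can be mirrored in portable C, four lanes per iteration. This is only an illustration under a few assumptions, not compiler output or the author's code. The helper init_bits32() and the function names are hypothetical, bits32[i] is assumed to hold 1 << i, uint32_t stands in for the post's unsigned int, and xmm0 and xmm1 are assumed to hold the broadcast bitboard and all zeros (the <setup> part is not shown in the post).

```c
#include <stdint.h>

/* Precomputed single-bit masks. The post declares 64 entries; the
   32-bit loop only reads the first 32. */
static uint32_t bits32[64];

/* Hypothetical helper (not in the post): assumes bits32[i] = 1 << i. */
static void init_bits32(void)
{
    int i;
    for (i = 0; i < 32; i++)
        bits32[i] = (uint32_t)1 << i;
}

/* Scalar reference: the loop from the post, without the pragma. */
static int dot_product32_ref(uint32_t bb, const unsigned char weight[])
{
    uint32_t sum = 0;
    int i;
    for (i = 0; i < 32; i++)
        if (bb & bits32[i])
            sum += weight[i];
    return (int)sum;
}

/* Plain-C emulation of the mask-and-add pattern in the listing.
   Each inner iteration corresponds to one 32-bit lane of an XMM register. */
static int dot_product32_masked(uint32_t bb, const unsigned char weight[])
{
    uint32_t partial[4] = {0, 0, 0, 0};  /* plays the role of xmm2 */
    int i, lane;
    for (i = 0; i < 32; i += 4) {
        for (lane = 0; lane < 4; lane++) {
            /* pand: isolate the tested bit of the bitboard */
            uint32_t t = bits32[i + lane] & bb;
            /* pcmpeqd against zero: all ones where the bit is NOT set */
            uint32_t is_clear = 0u - (uint32_t)(t == 0);
            /* movd + punpcklbw + punpcklwd: widen one byte to 32 bits */
            uint32_t w = weight[i + lane];
            /* pandn then paddd: add the weight only where the bit is set */
            partial[lane] += ~is_clear & w;
        }
    }
    /* "<compute partial sums>": horizontal add of the four lanes */
    return (int)(partial[0] + partial[1] + partial[2] + partial[3]);
}
```

After calling init_bits32(), both functions should return the same value for any bitboard and weight table. The compare yields a mask for the cleared bits, which is why the listing uses pandn rather than pand before the add.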
Last modified: Thu, 15 Apr 21 08:11:13 -0700