Author: Aart J.C. Bik
Date: 11:40:50 01/13/05
The following attempt at the 64-bit version does vectorize, but I see no
speedup over the sequential compilation of the same implementation. It is,
however, faster than the original source code that uses shifts:
unsigned int bits32[32]; /* precomputed shifts, presumably bits32[i] == 1u << i */

int dotProduct64(unsigned __int64 bb, unsigned char weight[])
{
    int i;
    int sum = 0;
    unsigned int b1 = (unsigned int) bb;         /* low 32 bits */
    unsigned int b2 = (unsigned int) (bb >> 32); /* high 32 bits */
#pragma ivdep
#pragma vector aligned
    for (i = 0; i < 32; i++) {
        if (b1 & bits32[i]) sum += weight[i];
        if (b2 & bits32[i]) sum += weight[i+32];
    }
    return sum;
}
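For anyone who wants to try this outside the Intel compiler, here is a rough portable sketch of the same idea. It is not the original test code: uint64_t stands in for unsigned __int64, the pragmas are dropped, and init_bits32, dotProduct64_shift and check_agreement are hypothetical helpers added only for illustration. In particular, the shift-based version is just my guess at what a shift loop would look like, and the table contents (bits32[i] == 1 << i) are assumed from the "precompute shifts" comment.

```c
#include <assert.h>
#include <stdint.h>

uint32_t bits32[32]; /* assumed contents: bits32[i] == 1u << i */

/* Hypothetical helper: fill the mask table once at startup. */
void init_bits32(void)
{
    for (int i = 0; i < 32; i++)
        bits32[i] = (uint32_t)1 << i;
}

/* Same loop as in the post, minus the compiler-specific pragmas. */
int dotProduct64(uint64_t bb, const unsigned char weight[])
{
    uint32_t b1 = (uint32_t)bb;         /* low 32 bits */
    uint32_t b2 = (uint32_t)(bb >> 32); /* high 32 bits */
    int sum = 0;
    for (int i = 0; i < 32; i++) {
        if (b1 & bits32[i]) sum += weight[i];
        if (b2 & bits32[i]) sum += weight[i + 32];
    }
    return sum;
}

/* Assumed shift-based reference: test each of the 64 bits by shifting. */
int dotProduct64_shift(uint64_t bb, const unsigned char weight[])
{
    int sum = 0;
    for (int i = 0; i < 64; i++)
        if ((bb >> i) & 1) sum += weight[i];
    return sum;
}

/* Hypothetical self-check: both versions should agree for any input. */
void check_agreement(uint64_t bb, const unsigned char weight[])
{
    assert(dotProduct64(bb, weight) == dotProduct64_shift(bb, weight));
}
```

Call init_bits32() once before the first dotProduct64() call; weight must hold at least 64 entries. check_agreement() is only a sanity check that the masked loop and the shift loop compute the same sum and says nothing about relative speed.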