Computer Chess Club Archives

Subject: Re: SSE2 bit[64] * byte[64] dot product

Author: Anthony Cozzie

Date: 09:31:59 07/20/04

What do you think of the following C code:

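int bb_dot_product(bitboard a, unsigned char *weights)
{
   int i;
   //weights is read 8 bytes at a time (assumes 8-byte alignment, little-endian).
   bitboard t, t1, sum = 0, *_weights = (bitboard *)weights;
   //table[b] expands each bit of b into a full byte; the 256 entries are elided.
   static bitboard table[256] = {correct translations, e.g. 0xFF -> 0xFFFFFFFFFFFFFFFF};

   //we count on the compiler to unroll this loop.
   for(i = 0; i < 8; i++) {
      //keep only the weights whose bits are set in byte i of a
      t = table[(a >> i*8) & 0xFF] & _weights[i];
      t1 = t >> 8;
      //add the even and the odd bytes into four 16-bit lanes
      sum += (t & 0x00FF00FF00FF00FF) + (t1 & 0x00FF00FF00FF00FF);
   }

   //fold the four 16-bit lanes into two 32-bit lanes, then into one total
   sum = (sum & 0x0000FFFF0000FFFF) + ((sum >> 16) & 0x0000FFFF0000FFFF);
   sum = (sum >> 32) + (sum & 0x00000000FFFFFFFF);
   return (int)sum;
}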
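
For anyone who wants to try the idea, here is a minimal self-contained sketch of the same scheme. It is not a tuned implementation. The names `expand_table`, `init_expand_table`, `bb_dot_product_swar`, `bb_dot_product_naive` and `check_agreement` are made up for illustration, the table is built at startup rather than written out as an initializer, and the 8-byte load through `memcpy` assumes a little-endian host so that byte j of the loaded word is `weights[8*i + j]`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t bitboard;

/* Lookup table: bit j of the index becomes byte j (0xFF if set, else 0x00). */
static bitboard expand_table[256];

/* Fill the table once at startup (256 entries * 8 bytes = 2KB). */
void init_expand_table(void)
{
    for (int b = 0; b < 256; b++) {
        bitboard m = 0;
        for (int j = 0; j < 8; j++)
            if (b & (1 << j))
                m |= (bitboard)0xFF << (8 * j);
        expand_table[b] = m;
    }
}

/* Straightforward reference: add weights[k] for every set bit k of a. */
int bb_dot_product_naive(bitboard a, const unsigned char *weights)
{
    int sum = 0;
    for (int k = 0; k < 64; k++)
        if ((a >> k) & 1)
            sum += weights[k];
    return sum;
}

/* Table-driven version; init_expand_table() must have been called first. */
int bb_dot_product_swar(bitboard a, const unsigned char *weights)
{
    bitboard sum = 0;
    for (int i = 0; i < 8; i++) {
        bitboard w, t;
        /* Load 8 weights as one word (little-endian byte order assumed). */
        memcpy(&w, weights + 8 * i, sizeof w);
        /* Zero out the weights whose bits are clear in byte i of a. */
        t = expand_table[(a >> (8 * i)) & 0xFF] & w;
        /* Add even and odd bytes into four 16-bit lanes. */
        sum += (t & 0x00FF00FF00FF00FFULL) + ((t >> 8) & 0x00FF00FF00FF00FFULL);
    }
    /* Fold four 16-bit lanes into two 32-bit lanes, then into one total. */
    sum = (sum & 0x0000FFFF0000FFFFULL) + ((sum >> 16) & 0x0000FFFF0000FFFFULL);
    sum = (sum >> 32) + (sum & 0x00000000FFFFFFFFULL);
    return (int)sum;
}

/* Sanity check: both versions must agree on a given input. */
void check_agreement(bitboard a, const unsigned char *weights)
{
    assert(bb_dot_product_swar(a, weights) == bb_dot_product_naive(a, weights));
}
```

Call `init_expand_table()` once, then `bb_dot_product_swar(board, weights)` with a 64-entry weight array; `check_agreement` is only there to compare against the bit-by-bit loop while experimenting.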

It has several advantages: each weight can use the full 0-255 range, the table
does not have to be rotated, and there is no penalty for moving between the
integer and MMX pipes.
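
The full 0-255 range is safe because the partial sums are kept in 16-bit lanes. A small sketch of the worst-case arithmetic follows; the helper names `worst_case_lane` and `worst_case_total` are hypothetical and only restate the loop structure of the code above.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case value of one 16-bit lane of the running sum after the loop:
   each of the 8 iterations adds two bytes (one even, one odd) to the lane. */
uint32_t worst_case_lane(uint32_t max_weight)
{
    return max_weight * 2u * 8u;
}

/* Worst-case final result: all 64 bits set, all 64 weights at max_weight. */
uint32_t worst_case_total(uint32_t max_weight)
{
    return max_weight * 64u;
}
```

With a maximum weight of 255 a lane peaks at 4080, far below the 65535 a 16-bit lane can hold, and the final total peaks at 16320, so no carry ever crosses a lane boundary.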

OTOH, this solution is also much less cache friendly: it requires maybe 2x the
number of instructions and also needs 2KB of data cache for the lookup table
(256 entries of 8 bytes each).

anthony

