Computer Chess Club Archives

Subject: Re: planning a SSE-optimized chess engine

Author: Gerd Isenberg

Date: 13:23:30 01/13/05

On January 13, 2005 at 14:40:50, Aart J.C. Bik wrote:

>
>The following attempt for the 64-bit version will vectorize, but I see no
>speedup over the sequential compilation of the same implementation (it is faster
>than the original source code with shift, however):
>
>unsigned int bits32[32];  /* precompute shifts */
>
>int dotProduct64(unsigned __int64 bb, unsigned char weight[])
>{
> int i;
> int sum = 0;
> unsigned int b1 = bb;
> unsigned int b2 = bb>>32;
>#pragma ivdep
>#pragma vector aligned
> for (i=0; i < 32; i++) {
>    if (b1 & bits32[i]) sum += weight[i];
>    if (b2 & bits32[i]) sum += weight[i+32];
> }
> return sum;
>}
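
For anyone who wants to try the quoted loop outside the Intel compiler, here is a minimal portable sketch. It assumes that bits32[i] is meant to hold the single-bit mask 1 << i, which is how I read the "precompute shifts" comment. The names initBits32 and dotProduct64Ref are hypothetical helpers that do not appear in the quoted post, uint64_t stands in for unsigned __int64, and the compiler-specific pragmas are left out.

```c
#include <stdint.h>

/* Assumption: bits32[i] holds the mask 1 << i, so the loop body
   needs no variable shift. */
uint32_t bits32[32];

/* Hypothetical helper: fill the mask table once at startup. */
void initBits32(void)
{
    for (int i = 0; i < 32; i++)
        bits32[i] = (uint32_t)1 << i;
}

/* Table-driven version following the quoted loop: the low and high
   32-bit halves of the bitboard are tested against the same mask. */
int dotProduct64(uint64_t bb, const unsigned char weight[64])
{
    uint32_t lo = (uint32_t)bb;
    uint32_t hi = (uint32_t)(bb >> 32);
    int sum = 0;
    for (int i = 0; i < 32; i++) {
        if (lo & bits32[i]) sum += weight[i];
        if (hi & bits32[i]) sum += weight[i + 32];
    }
    return sum;
}

/* Hypothetical shift-based reference, used only to cross-check
   the table-driven version. */
int dotProduct64Ref(uint64_t bb, const unsigned char weight[64])
{
    int sum = 0;
    for (int i = 0; i < 64; i++)
        if ((bb >> i) & 1) sum += weight[i];
    return sum;
}
```

Call initBits32() once before the first dotProduct64() call. Both functions should return the same sum for any bitboard; this sketch says nothing about which one is faster.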


Yes, I guess you are ambitious about 128-bit ALUs as well.
38 AMD64 cycles to beat ;-)

Gerd

