Computer Chess Club Archives




Subject: Re: SSE2 bit[64] * byte[64] dot product

Author: Daniel Clausen

Date: 05:49:32 07/22/04

On July 22, 2004 at 08:38:48, Gerd Isenberg wrote:


>Another similar shuffling sequence, one instruction shorter and therefore slightly
>faster with appropriate scheduling of independent initialization instructions.
>It can also be used when the bitboard is already passed in xmm0:
>  movq       xmm0, [bb]  ; 0x0000000000000000:0xf0e1d2c3b4a59687
>  punpcklbw  xmm0, xmm0  ; 0xf0f0e1e1d2d2c3c3:0xb4b4a5a596968787
>  movdqa     xmm2, xmm0
>  punpcklwd  xmm0, xmm0  ; 0xb4b4b4b4a5a5a5a5:0x9696969687878787
>  punpckhwd  xmm2, xmm2  ; 0xf0f0f0f0e1e1e1e1:0xd2d2d2d2c3c3c3c3
>  movdqa     xmm1, xmm0
>  movdqa     xmm3, xmm2
>  punpckldq  xmm0, xmm0  ; 0x9696969696969696:0x8787878787878787
>  punpckhdq  xmm1, xmm1  ; 0xb4b4b4b4b4b4b4b4:0xa5a5a5a5a5a5a5a5
>  punpckldq  xmm2, xmm2  ; 0xd2d2d2d2d2d2d2d2:0xc3c3c3c3c3c3c3c3
>  punpckhdq  xmm3, xmm3  ; 0xf0f0f0f0f0f0f0f0:0xe1e1e1e1e1e1e1e1
>The Unpack and Interleave instructions have become familiar now ;-)
>Anyway, I'm not quite sure whether to take this additional overhead or to stay
>with the rotated weights.
>  movq       xmm0, [bb]  ; 0x0000000000000000:0xf0e1d2c3b4a59687
>  punpcklqdq xmm0, xmm0  ; 0xf0e1d2c3b4a59687:0xf0e1d2c3b4a59687
>  movdqa     xmm1, xmm0  ; 0xf0e1d2c3b4a59687:0xf0e1d2c3b4a59687
>  movdqa     xmm2, xmm0  ; 0xf0e1d2c3b4a59687:0xf0e1d2c3b4a59687
>  movdqa     xmm3, xmm0  ; 0xf0e1d2c3b4a59687:0xf0e1d2c3b4a59687
>Rotating a square index here and there is not that expensive...
>I plan to have most weights precomputed already, probably indexed by some
>(king) squares and side.
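
For readers following the hex comments, here is a minimal scalar sketch in C of what the first quoted unpack sequence leaves in xmm0..xmm3, going only by the register comments above. The function name replicate_bytes and the flat 64-byte output layout are assumptions made for illustration; they are not taken from Gerd's code. Bytes 0..15 of the output stand for xmm0, 16..31 for xmm1, 32..47 for xmm2 and 48..63 for xmm3, each in little-endian memory order.

```c
#include <stdint.h>

/* Hypothetical reference model (not from the original post):
 * byte k of the bitboard is copied into out[8*k] .. out[8*k + 7],
 * which is the layout shown in the comments of the quoted
 * punpcklbw / punpck?wd / punpck?dq sequence. */
void replicate_bytes(uint64_t bb, uint8_t out[64])
{
    for (int k = 0; k < 8; ++k) {
        uint8_t b = (uint8_t)(bb >> (8 * k));  /* byte k, lowest byte first */
        for (int j = 0; j < 8; ++j)
            out[8 * k + j] = b;                /* replicate it eight times */
    }
}
```

Calling replicate_bytes(0xf0e1d2c3b4a59687, out) with the example value from the quote fills out[0..7] with 0x87 and out[8..15] with 0x96, matching the low and high quadwords in the comment on the final punpckldq xmm0 line. A model like this may be handy for checking an SSE2 implementation against a plain loop.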

I just wanted to remind you guys that the goal of the game is to checkmate your
opponent. With all these hex-numbers and asm instructions it's easy to lose
track. ;)

