Computer Chess Club Archives


Subject: Re: SSE2-Intrinsic Wrapper - further comments

Author: Gerd Isenberg

Date: 10:49:39 10/08/03



Note that the binary add and sub operators work bytewise:

	friend CDblBB operator+(const CDblBB &a, const CDblBB &b) {
		return CDblBB(_mm_add_epi8(a.dbl, b.dbl));}
	friend CDblBB operator-(const CDblBB &a, const CDblBB &b) {
		return CDblBB(_mm_sub_epi8(a.dbl, b.dbl));}
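To illustrate why the subtraction is bytewise, here is a minimal sketch of the rankwise trick on raw __m128i values. The helper names fromU64, lane and rankAttacksUp are hypothetical and not part of the wrapper. It assumes at most one rook per rank byte, that the rook is included in the occupancy, and that "right" means toward higher bit indices within a byte.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helpers for this sketch (not part of the original wrapper).
inline __m128i fromU64(std::uint64_t lo, std::uint64_t hi) {
    const std::uint64_t v[2] = {lo, hi};
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(v));
}

inline std::uint64_t lane(__m128i x, int i) {
    std::uint64_t v[2];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(v), x);
    return v[i];
}

// For every rank byte: occ ^ (occ - 2*rook).
// The bytewise subtraction keeps the borrow inside the rank, so it stops
// at the first blocker or at the end of the byte.
inline __m128i rankAttacksUp(__m128i occ, __m128i rooks) {
    __m128i twoRooks = _mm_add_epi8(rooks, rooks);   // 2*rook per byte
    __m128i diff     = _mm_sub_epi8(occ, twoRooks);  // occ - 2*rook per byte
    return _mm_xor_si128(occ, diff);                 // bits changed by the borrow
}
```

With a rook on bit 1 and a blocker on bit 5 of the first rank, the result for that byte is bits 2 to 5. With a lone rook on bit 0 of the second rank, the result is bits 1 to 7, and nothing spills into the third rank as it would with 64-bit subtraction.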

That is probably not a good idea; it is only there to get right attacks with the
rankwise x ^ (x-2) trick. For bit extraction via x & -x, one may overload unary
minus with 64-bit arithmetic instead (not tested):

	friend CDblBB operator-(const CDblBB &a) {
		return CDblBB(_mm_sub_epi64(_mm_setzero_si128(), a.dbl));}
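As a sketch of what the 64-bit negation is for, the x & -x extraction can be written directly on __m128i values. The helper names fromU64, lane and lowestBits are hypothetical, not part of the wrapper; each 64-bit half of the register is treated as an independent bitboard.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helpers for this sketch (not part of the original wrapper).
inline __m128i fromU64(std::uint64_t lo, std::uint64_t hi) {
    const std::uint64_t v[2] = {lo, hi};
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(v));
}

inline std::uint64_t lane(__m128i x, int i) {
    std::uint64_t v[2];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(v), x);
    return v[i];
}

// Isolate the least significant one bit of each 64-bit lane: x & -x.
// The negation is 0 - x with 64-bit lanes, so the carry crosses byte
// boundaries within each bitboard, unlike _mm_sub_epi8.
inline __m128i lowestBits(__m128i x) {
    __m128i neg = _mm_sub_epi64(_mm_setzero_si128(), x);
    return _mm_and_si128(x, neg);
}
```

A bytewise negation would instead isolate the lowest bit of every non-empty byte, which is why the bytewise binary minus cannot be reused here.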

Unfortunately, there is no intrinsic setter, analogous to the "pxor"-based
_mm_setzero_si128(), that sets all bits in an xmm register via PCMPEQD xmm,xmm.
Otherwise the -1, notA and notH masks could be calculated more cheaply in a
register on the fly.

The intrinsics

	__m128i tmp;
	tmp = _mm_cmpeq_epi32(tmp, tmp);

are translated to, e.g.,

  movdqa   xmm1, xmm0
  pcmpeqd  xmm0, xmm1

with a possible register stall, instead of the direct

  pcmpeqd  xmm0, xmm0

But all of this was observed with MSVC6.
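For illustration, one possible way to build these masks in registers is sketched below. The names allOnes, notAMask, notHMask and lane are hypothetical, and the sketch assumes the A-file is bit 0 of each rank byte, so that notA is 0xFE and notH is 0x7F per byte. It compares a zeroed register with itself, which avoids reading an uninitialized variable; what instruction sequence the compiler actually emits is not guaranteed.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper for this sketch (not part of the original wrapper).
inline std::uint64_t lane(__m128i x, int i) {
    std::uint64_t v[2];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(v), x);
    return v[i];
}

// All bits set: any register compared with itself is equal in every lane.
inline __m128i allOnes() {
    __m128i z = _mm_setzero_si128();
    return _mm_cmpeq_epi32(z, z);
}

// 0xFE in every byte: bytewise 0xFF + 0xFF wraps to 0xFE.
inline __m128i notAMask() {
    __m128i ones = allOnes();
    return _mm_add_epi8(ones, ones);
}

// 0x7F in every byte: the rounded byte average of 0xFF and 0x00 is 0x80,
// and xor with all ones flips it to 0x7F.
inline __m128i notHMask() {
    __m128i ones = allOnes();
    __m128i high = _mm_avg_epu8(ones, _mm_setzero_si128());
    return _mm_xor_si128(ones, high);
}
```

Whether this is actually cheaper than loading the constants from memory depends on the compiler and the target processor, and would need to be measured.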

Gerd




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.