Computer Chess Club Archives



Subject: Re: 64Bit optimize coding - My experience (AMD64)

Author: Gerd Isenberg

Date: 11:22:56 12/21/05



another approach, with type and int template parameters:

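// T    : element type used for the AND (int, long long, an SSE2 wrapper, ...)
// nint : number of ints in each array
// The loop is unrolled four times, so (nint * sizeof(int)) / sizeof(T)
// must be a multiple of 4, and the arrays must be suitably aligned for T.
template <class T, unsigned int nint>
void andIntVector(const int* i1, const int* i2, int* i3)
{
  // reinterpret the int arrays as arrays of T
  const T* t1 = (const T*)i1;
  const T* t2 = (const T*)i2;
  T* t3 = (T*)i3;
  for (unsigned int i=0; i < (nint * sizeof(int))/sizeof(T); i+= 4) {
    t3[i+0] = t1[i+0] & t2[i+0];
    t3[i+1] = t1[i+1] & t2[i+1];
    t3[i+2] = t1[i+2] & t2[i+2];
    t3[i+3] = t1[i+3] & t2[i+3];
  }
}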

with 16-byte aligned arrays such as

#define XMM_ALIGN __declspec(align(16))

int XMM_ALIGN a1[2048];
int XMM_ALIGN a2[2048];
int XMM_ALIGN a3[2048];

and to use either

  andIntVector<int,2048>(a1, a2, a3);
  andIntVector<long long,2048>(a1, a2, a3);

or, with an SSE2 wrapper class (XMM) that holds two long longs:
http://hornid.com/cgi-bin/ccc/topic_show.pl?pid=306573;hl=SSE2#pid306573

  andIntVector<XMM,2048>(a1, a2, a3);
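
The XMM class itself is in the linked post and is not repeated here. As a rough idea of what such a wrapper needs for this template, here is a minimal sketch. It is a hypothetical stand-in, not the class from that post: it assumes an x86 target with SSE2, uses the standard intrinsics from <emmintrin.h>, and swaps __declspec(align(16)) for C++11 alignas(16). The helper xmmMatchesScalar is made up for the check only.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2 intrinsics: __m128i, _mm_and_si128

// Hypothetical stand-in for the XMM wrapper; the class in the linked post may differ.
struct XMM {
    __m128i v;  // 128 bits, i.e. two long longs or four ints
    explicit XMM(__m128i x) : v(x) {}
    friend XMM operator&(const XMM& a, const XMM& b) {
        return XMM(_mm_and_si128(a.v, b.v));  // one 128-bit AND
    }
};

// Template from the post, repeated so the sketch compiles on its own.
template <class T, unsigned int nint>
void andIntVector(const int* i1, const int* i2, int* i3)
{
    const T* t1 = (const T*)i1;
    const T* t2 = (const T*)i2;
    T* t3 = (T*)i3;
    for (unsigned int i = 0; i < (nint * sizeof(int)) / sizeof(T); i += 4) {
        t3[i+0] = t1[i+0] & t2[i+0];
        t3[i+1] = t1[i+1] & t2[i+1];
        t3[i+2] = t1[i+2] & t2[i+2];
        t3[i+3] = t1[i+3] & t2[i+3];
    }
}

// alignas(16) is the C++11 counterpart of __declspec(align(16)).
alignas(16) int a1[2048];
alignas(16) int a2[2048];
alignas(16) int a3[2048];

// Fills the inputs, runs the XMM instantiation, and compares the
// result against a plain int-by-int AND.
bool xmmMatchesScalar()
{
    for (int i = 0; i < 2048; ++i) {
        a1[i] = i * 7 + 1;
        a2[i] = 0x0F0F0F0F ^ i;
        a3[i] = -1;
    }
    andIntVector<XMM, 2048>(a1, a2, a3);
    for (int i = 0; i < 2048; ++i) {
        if (a3[i] != (a1[i] & a2[i])) return false;
    }
    return true;
}
```

With 2048 ints the XMM version runs 512 element operations, 128 loop iterations of four ANDs each. The 16-byte alignment matters here, since copying an XMM value may compile to an aligned 128-bit load or store.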

for the other extreme, you may even try ;-)

  andIntVector<short,2048>(a1, a2, a3);
  andIntVector<char,2048>(a1, a2, a3);
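
One caveat with all of these: the casts read int storage through a T*, which the C++ aliasing rules only allow when T is a character type, so an optimizer is formally free to reorder such accesses. If that ever bites, a possible workaround is to move the data with memcpy, which compilers usually turn back into plain loads and stores. The sketch below is only an illustration of that idea; andIntVectorSafe and matchesIntAnd are hypothetical names, and the loop is not unrolled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical variant of andIntVector: same template parameters, but
// every element is moved with memcpy instead of being read through a T*.
template <class T, unsigned int nint>
void andIntVectorSafe(const int* i1, const int* i2, int* i3)
{
    static_assert((nint * sizeof(int)) % sizeof(T) == 0,
                  "array size must be a multiple of sizeof(T)");
    const std::size_t count = (nint * sizeof(int)) / sizeof(T);
    const unsigned char* p1 = reinterpret_cast<const unsigned char*>(i1);
    const unsigned char* p2 = reinterpret_cast<const unsigned char*>(i2);
    unsigned char* p3 = reinterpret_cast<unsigned char*>(i3);
    for (std::size_t i = 0; i < count; ++i) {
        T x, y;
        std::memcpy(&x, p1 + i * sizeof(T), sizeof(T));
        std::memcpy(&y, p2 + i * sizeof(T), sizeof(T));
        const T z = static_cast<T>(x & y);  // char and short promote to int
        std::memcpy(p3 + i * sizeof(T), &z, sizeof(T));
    }
}

// Checks that the T instantiation gives the same result as an
// int-by-int AND over 2048 ints.
template <class T>
bool matchesIntAnd()
{
    const unsigned int N = 2048;
    static int a1[N], a2[N], a3[N];
    for (unsigned int i = 0; i < N; ++i) {
        a1[i] = static_cast<int>(i * 7u + 1u);
        a2[i] = static_cast<int>(0x0F0F0F0Fu ^ i);
        a3[i] = -1;
    }
    andIntVectorSafe<T, N>(a1, a2, a3);
    for (unsigned int i = 0; i < N; ++i) {
        if (a3[i] != (a1[i] & a2[i])) return false;
    }
    return true;
}
```

Since AND works bit by bit, the result is the same for every element width; only the number of operations changes, from 8192 with char down to 1024 with long long.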

Gerd




Last modified: Thu, 15 Apr 21 08:11:13 -0700
