Computer Chess Club Archives



Subject: Re: Participants 25th Open Dutch

Author: Gerd Isenberg

Date: 11:22:43 10/22/05



On October 22, 2005 at 11:50:07, Stan Arts wrote:

>Wow, looks like Neurosis will have to go in therapy after that is over.
>It is finally pondering since a few days (jippy!), searches one-two ply deeper
>and playing better then last year, but will probably end up with a worse result.
>:| Oh well, will be nice to meet all these people, and that there is such a full
>participants field from all over the world.
>

Yep - a nice jubilee field.
What happened to SpiderChess? Did Martin withdraw?
Usurpator made some hidden version jumps and is version V now.
An Intel Celeron machine emulating a 6809?

>Why does it say IsiChess XMM instead of MMX now? eXtra Mean Machine? Or ehh,
>eXtreme Mating Moves,
>eXtraordinary Move Maker,
>eXtremely Mad Moves,
>X64bit Might Melt, or (considering your technical posts sometimes),
>Xorredultrafastmachineoptimisedflipflopfifostackerpopper Make Movefunction?
>
>Greetings
>Stan


Hehe - no it's a register set - you didn't know? ;-)

128-bit XMM registers (eXtended Multi Media?).
These registers are used by the SSE(1/2/3) instruction sets (Streaming SIMD
Extensions; SIMD = single instruction, multiple data) - introduced with the
Intel Pentium III (SSE2 came with the P4) with eight registers XMM0-XMM7.

SSE instructions work on vectors of two doubles or four floats - several packed
integer instructions work on vectors of 16 bytes, 8 shorts, 4 ints or two long
longs. AMD has extended the registers to 16 in 64-bit mode - I am still 32-bit
and have only 8 XMM registers. I am back from MMX fill stuff to rotated
bitboards again, because on my current AMD64 box they are much faster than on
my previous Athlons. I only do some fill stuff with XMM for pinned piece
detection and legal king move generation, and the 64-bit * 64-byte dot product
for weighted attack counts in eval.
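
The "fill stuff" itself is not shown in this post. As a rough illustration only, a plain C occluded fill in one direction could look like the sketch below. The names fillNorthOccluded and northAttacks are made up here, the mapping a1 = bit 0 with one rank per byte is an assumption, and this is the simple loop form, not necessarily what IsiChess does in XMM registers.

```c
#include <stdint.h>

typedef uint64_t BitBoard;

/* Occluded fill to the north: starting from the squares in gen, keep
   stepping one rank up (shift left by 8) as long as the square stepped
   onto is in empty. Returns the start squares plus every square reached.
   Assumed mapping: a1 = bit 0, h8 = bit 63. */
BitBoard fillNorthOccluded(BitBoard gen, BitBoard empty) {
    BitBoard fill = gen;
    for (int i = 0; i < 7; ++i) {
        gen = (gen << 8) & empty;   /* advance only over empty squares */
        fill |= gen;
    }
    return fill;
}

/* Squares attacked to the north by the given sliders: one more step from
   the filled set, so the first blocker is included in the attack set. */
BitBoard northAttacks(BitBoard sliders, BitBoard empty) {
    return fillNorthOccluded(sliders, empty) << 8;
}
```

For pin detection one would typically intersect such a fill from the king with the opposite-direction fill from the enemy sliders; the other seven directions need their own shift and, for the horizontal and diagonal ones, a wrap mask.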

Cheers,
Gerd


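/* Weighted attack count: dot product of the 64 bits of bb with 64 byte weights.
   rorWeights is stored rotated: the weight for bit b of byte r of bb sits at
   rorWeights[8*b + r]. It must be 16-byte aligned, and the four weights that
   share a byte lane should sum to at most 255 because of the saturating adds.
   MSVC 32-bit inline assembly, SSE2. */
int dotProductRotated(BitBoard bb, BYTE rorWeights[]) {
  static const BitBoard CACHE_ALIGN MaskConsts[8] = {
    0x0101010101010101, 0x0202020202020202,
    0x0404040404040404, 0x0808080808080808,
    0x1010101010101010, 0x2020202020202020,
    0x4040404040404040, 0x8080808080808080
  };
  __asm
  {
    movq        xmm0, [bb]
    lea         edx,  [MaskConsts]
    mov         eax,  [rorWeights]
    punpcklqdq  xmm0, xmm0       ; bb in both 64-bit halves
    pxor        xmm4, xmm4       ; zero
    movdqa      xmm1, xmm0
    movdqa      xmm2, xmm0
    movdqa      xmm3, xmm0
    pandn       xmm0, [edx+0*16] ; ~bb & mask: zero byte where the bit is set
    pandn       xmm1, [edx+1*16]
    pandn       xmm2, [edx+2*16]
    pandn       xmm3, [edx+3*16]
    pcmpeqb     xmm0, xmm4       ; 0xff where the bit is set, else 0x00
    pcmpeqb     xmm1, xmm4
    pcmpeqb     xmm2, xmm4
    pcmpeqb     xmm3, xmm4
    pand        xmm0, [eax+0*16] ; and with weights
    pand        xmm1, [eax+1*16]
    pand        xmm2, [eax+2*16]
    pand        xmm3, [eax+3*16]
    paddusb     xmm0, xmm1       ; add all bytes (with saturation)
    paddusb     xmm0, xmm2
    paddusb     xmm0, xmm3
    psadbw      xmm0, xmm4       ; horizontal add 2 * 8 byte
    pextrw      edx, xmm0, 4     ; extract both intermediate sums to gp
    pextrw      eax, xmm0, 0
    add         eax, edx         ; final add, result is returned in eax
  }
}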
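
To check the routine above on a machine without MSVC inline assembly, a scalar model can help. The following is a sketch derived from reading the instruction sequence, not code from the post: dotProductRotatedRef and rotateWeights are hypothetical names, and the rotated index 8*bit + byte is inferred from how the mask constants and the weight loads line up.

```c
#include <stdint.h>

typedef uint64_t BitBoard;
typedef unsigned char BYTE;

/* Build the rotated table from a table indexed by square (sq = 8*byte + bit).
   Hypothetical helper; the post does not show how rorWeights is filled. */
void rotateWeights(const BYTE weights[64], BYTE rorWeights[64]) {
    for (int sq = 0; sq < 64; ++sq)
        rorWeights[8 * (sq & 7) + (sq >> 3)] = weights[sq];
}

/* Scalar model of the SSE2 routine, lane by lane. */
int dotProductRotatedRef(BitBoard bb, const BYTE rorWeights[64]) {
    int total = 0;
    for (int half = 0; half < 2; ++half) {            /* low/high 8 bytes of a register */
        for (int byteIdx = 0; byteIdx < 8; ++byteIdx) {
            unsigned lane = 0;                        /* one byte lane of xmm0 */
            for (int reg = 0; reg < 4; ++reg) {       /* xmm0..xmm3 */
                int bit = 2 * reg + half;             /* bit tested in each byte of bb */
                if ((bb >> (8 * byteIdx + bit)) & 1)
                    lane += rorWeights[8 * bit + byteIdx];
            }
            if (lane > 255)                           /* paddusb saturates at 255 */
                lane = 255;
            total += (int)lane;                       /* psadbw plus the final add */
        }
    }
    return total;
}
```

The saturation clamp is there only to mirror paddusb; with weights small enough that four of them never exceed 255, the result is the ordinary sum of the weights of all set squares.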




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.