Author: Gerd Isenberg
Date: 11:22:43 10/22/05
On October 22, 2005 at 11:50:07, Stan Arts wrote:
>Wow, looks like Neurosis will have to go into therapy after that is over.
>It has finally been pondering for a few days (yippee!), searches one or two plies
>deeper and plays better than last year, but will probably end up with a worse result.
>:| Oh well, it will be nice to meet all these people, and that there is such a full
>field of participants from all over the world.
>
Yep - a nice jubilee field.
What happened to SpiderChess? Did Martin withdraw?
Usurpator made some hidden version jumps and is version V now.
An Intel Celeron machine emulating a 6809?
>Why does it say IsiChess XMM instead of MMX now? eXtra Mean Machine? Or ehh,
>eXtreme Mating Moves,
>eXtraordinary Move Maker,
>eXtremely Mad Moves,
>X64bit Might Melt, or (considering your technical posts sometimes),
>Xorredultrafastmachineoptimisedflipflopfifostackerpopper Make Movefunction?
>
>Greetings
>Stan
Hehe - no, it's a register set - you didn't know? ;-)
128-bit XMM registers (eXtended Multi Media?).
These registers are used by the SSE/SSE2/SSE3 instruction sets (Streaming SIMD
Extensions; SIMD = single instruction, multiple data). SSE was introduced with
the Intel Pentium III with eight registers XMM0-XMM7; SSE2 followed with the P4.
SSE instructions work on vectors of four floats, SSE2 adds vectors of two
doubles - and the SSE2 packed integer instructions work on vectors of 16 bytes,
8 shorts, 4 ints or two long longs. AMD has extended the registers to 16 in
64-bit mode - I am still on 32-bit with only 8 XMM registers. I am back from MMX
fill stuff to rotated bitboards again, because on my current AMD64 box it is
much faster than on my previous Athlons. I only do some fill stuff with XMM for
pinned piece detection and legal king move generation. And the 64-bit * 64-byte
dot product for weighted attack counts in eval.
Cheers,
Gerd
/* Dot product of a 64-bit bitboard with 64 byte weights (SSE2, 32-bit MSVC inline asm).
   "Rotated" layout: the weight for bit b of byte r of bb is rorWeights[8*b + r].
   rorWeights must be 16-byte aligned, since pand reads it as a memory operand.
   The result is left in eax, which MSVC takes as the int return value. */
int dotProductRotated(BitBoard bb, BYTE rorWeights[]) {
    static const BitBoard CACHE_ALIGN MaskConsts[8] = {
        0x0101010101010101, 0x0202020202020202,
        0x0404040404040404, 0x0808080808080808,
        0x1010101010101010, 0x2020202020202020,
        0x4040404040404040, 0x8080808080808080
    };
    __asm
    {
        movq       xmm0, [bb]
        lea        edx, [MaskConsts]
        mov        eax, [rorWeights]
        punpcklqdq xmm0, xmm0        ; bb in both 64-bit halves
        pxor       xmm4, xmm4        ; zero
        movdqa     xmm1, xmm0
        movdqa     xmm2, xmm0
        movdqa     xmm3, xmm0
        pandn      xmm0, [edx+0*16]  ; ~bb & mask: byte is zero where the bit is set
        pandn      xmm1, [edx+1*16]
        pandn      xmm2, [edx+2*16]
        pandn      xmm3, [edx+3*16]
        pcmpeqb    xmm0, xmm4        ; 0xff where the bit is set, else 0x00
        pcmpeqb    xmm1, xmm4
        pcmpeqb    xmm2, xmm4
        pcmpeqb    xmm3, xmm4
        pand       xmm0, [eax+0*16]  ; and with weights
        pand       xmm1, [eax+1*16]
        pand       xmm2, [eax+2*16]
        pand       xmm3, [eax+3*16]
        paddusb    xmm0, xmm1        ; add all bytes (with saturation)
        paddusb    xmm0, xmm2
        paddusb    xmm0, xmm3
        psadbw     xmm0, xmm4        ; horizontal add 2 * 8 byte
        pextrw     edx, xmm0, 4      ; extract both intermediate sums to gp
        pextrw     eax, xmm0, 0
        add        eax, edx          ; final add
    }
}
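
For anyone who wants to try the same idea outside MSVC inline assembly, here is a rough sketch with SSE2 intrinsics next to a plain scalar loop to check it against. It is not taken from IsiChess. The function names are made up for illustration, and the weight layout (the weight for bit b of byte r sits at index 8*b + r) is only read off the asm above. It also assumes the weights are small (at most 63 each), so that the saturating byte add never clips and both versions agree. Without SSE2 the second function simply falls back to the scalar loop.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define USE_SSE2 1
#else
#define USE_SSE2 0
#endif

/* Scalar reference (hypothetical name): sums the weight of every set bit.
   Bit sq of bb uses w[(sq & 7) * 8 + (sq >> 3)], the "rotated" layout. */
static int dot_product_rotated_ref(uint64_t bb, const uint8_t w[64]) {
    int sum = 0;
    for (int sq = 0; sq < 64; ++sq)
        if ((bb >> sq) & 1)
            sum += w[(sq & 7) * 8 + (sq >> 3)];
    return sum;
}

/* SSE2 version (hypothetical name), following the asm step by step.
   Assumes weights <= 63 so that the saturating add stays exact. */
static int dot_product_rotated_sse2(uint64_t bb, const uint8_t w[64]) {
#if USE_SSE2
    static const uint64_t masks[8] = {
        0x0101010101010101ULL, 0x0202020202020202ULL,
        0x0404040404040404ULL, 0x0808080808080808ULL,
        0x1010101010101010ULL, 0x2020202020202020ULL,
        0x4040404040404040ULL, 0x8080808080808080ULL
    };
    const __m128i zero = _mm_setzero_si128();
    __m128i x = _mm_loadl_epi64((const __m128i *)&bb);
    x = _mm_unpacklo_epi64(x, x);                 /* bb in both 64-bit halves */
    __m128i sum = zero;
    for (int k = 0; k < 4; ++k) {
        /* unaligned loads, so the caller's table needs no special alignment */
        __m128i m = _mm_loadu_si128((const __m128i *)&masks[2 * k]);
        __m128i t = _mm_andnot_si128(x, m);       /* ~bb & mask */
        t = _mm_cmpeq_epi8(t, zero);              /* 0xff where the bit is set */
        t = _mm_and_si128(t, _mm_loadu_si128((const __m128i *)(w + 16 * k)));
        sum = _mm_adds_epu8(sum, t);              /* saturating byte add */
    }
    sum = _mm_sad_epu8(sum, zero);                /* two sums of 8 bytes each */
    return _mm_extract_epi16(sum, 0) + _mm_extract_epi16(sum, 4);
#else
    return dot_product_rotated_ref(bb, w);
#endif
}
```

The loop over k just replaces the four hand-unrolled register copies; a compiler will usually unroll it again. If the weights can be larger than 63, the saturating add may clip a byte at 255, and the two functions can then disagree.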