Author: Gerd Isenberg
Date: 02:21:49 01/15/03
On January 14, 2003 at 19:23:38, Sander de Zoete wrote:
>Thanks Gerd,
>
>I also did some searching myself on the web. I ended up with a website where
>SIMD was explained together with MMX,SSE,MAJC, etc. Then I also managed to
>download the instruction sets from all these processor technologies. Can I use
>these like this in Cplusplus (Borland)?
I don't know whether Borland Cplusplus supports inline assembly and, if it does,
whether it also supports the MMX instructions. With MSC one must install the
processor pack to use MMX or the P4's SSE2 instructions. What is MAJC?
>
>I really see a lot of potential here.
>
>// Put BITBOARD a into MMX
>void SetBitboardIntoMMX(BITBOARD a, unsigned int b)
>{
> asm
> {
> movq a
>
> }
> // Shift a with value b to the right
> asm
> {
> psrlw a, b
>
> }
>}
>Will something like this work?
The Intel asm syntax is
<opcode> destination operand, source operand
<opcode> unary destination operand
for example:
mov eax, 1 ; load immediate constant to eax
xor ecx, ecx ; clears ecx
neg edx ; edx := -edx
Most arithmetical or logical x86 instructions set some flags (Carry = overflow,
Sign, Zero), on which you can then do conditional jumps:
test eax, edx ; like "and" without storing the result
jz label ; jump, if no bit is set
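For readers more at home in C, the test/jz pair above corresponds roughly to branching on whether a bitwise AND is zero. This is only a sketch; the helper name no_common_bit is made up for illustration and is not part of the original post.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 exactly when a and b share no set bit,
   i.e. in the case where "test eax, edx / jz label" would take the jump. */
static int no_common_bit(uint32_t a, uint32_t b)
{
    /* The AND result is only examined for zero and never stored,
       just as "test" sets the Zero flag without writing a register. */
    return (a & b) == 0;
}
```

A compiler is free to translate such a condition into test plus a conditional jump, but it may also choose other instructions.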
MMX instructions (and likewise the packed SSE, SSE2 and 3DNow! ones) are somewhat
different: they don't set any flags. There is no load of an immediate constant.
The whole MMX instruction set is not very orthogonal.
The instructions work on multiple data elements at once (hence Single Instruction
Multiple Data, SIMD): bytes, 16-bit words, 32-bit double words and 64-bit quad
words. Shifts (e.g. shift left logical: psllw, pslld, psllq) are supported for
words, dwords and qwords but not for bytes. Add and sub (paddb, paddw, paddd,
psubb, psubw, psubd) are supported for bytes, words and dwords, but there is no
64-bit arithmetic on qwords.
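To make the "independent lanes" idea concrete, here is a portable C sketch that emulates three of these instructions on a plain 64-bit integer. The function names (paddb_emu, psrlw_emu, psrlq_emu) are invented for this illustration; the code uses no MMX at all and only mimics the per-element behaviour.

```c
#include <stdint.h>

/* Emulates paddb: eight independent 8-bit additions.
   A carry out of one byte never reaches the neighbouring byte. */
static uint64_t paddb_emu(uint64_t x, uint64_t y)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        uint8_t xb = (uint8_t)(x >> (8 * i));
        uint8_t yb = (uint8_t)(y >> (8 * i));
        r |= (uint64_t)(uint8_t)(xb + yb) << (8 * i);
    }
    return r;
}

/* Emulates psrlw: four independent 16-bit logical right shifts.
   A count above 15 clears every word. */
static uint64_t psrlw_emu(uint64_t x, unsigned count)
{
    uint64_t r = 0;
    if (count > 15)
        return 0;
    for (int i = 0; i < 4; ++i) {
        uint16_t w = (uint16_t)(x >> (16 * i));
        r |= (uint64_t)(uint16_t)(w >> count) << (16 * i);
    }
    return r;
}

/* Emulates psrlq: one 64-bit logical right shift.
   A count above 63 gives zero. */
static uint64_t psrlq_emu(uint64_t x, unsigned count)
{
    return count > 63 ? 0 : x >> count;
}
```

Note the difference that matters for bitboards: with psrlw a bit shifted out of one 16-bit word is lost, whereas psrlq moves it into the word below.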
void SetBitboardIntoMMX(BITBOARD a, unsigned int b)
{
asm
{
movq mm5, [a] ; load a into mm5
movd mm3, [b] ; load b into mm3 with zero extension
psrlq mm5, mm3 ; a>>b (64-bit shift; psrlw would shift four 16-bit words)
}
}
The "movq mm5, [a]" incurs a huge penalty if "a" is not properly 8-byte
aligned. Because a bitboard "a" pushed on the stack is not necessarily aligned,
it is recommended to use two load instructions instead of one "movq".
void SetBitboardIntoMMX(BITBOARD a, unsigned int b)
{
asm
{
movd mm5, [a] ; load low dword a into mm5
punpckldq mm5, [a+4] ; load high dword a into mm5
movd mm3, [b] ; load b into mm3 with zero extension
psrlq mm5, mm3 ; a>>b (64-bit shift; psrlw would shift four 16-bit words)
}
}
The [a] memory source operand here is really a base- or stack-pointer-relative
memory access, like "movd mm5,[ebp+offset]". But you don't have to care about
that if you simply write [a]. If a is a global variable, it becomes "mov
reg,[address of var]".
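As a rough illustration of what the movd/punpckldq pair does, the following C sketch builds a 64-bit bitboard from two separate 32-bit reads. It assumes the x86 little-endian byte order (low dword stored first); the names load32_le and load_bitboard_split are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Reads four bytes as a little-endian 32-bit value, as x86 would. */
static uint32_t load32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Assembles a 64-bit value from two 32-bit halves. */
static uint64_t load_bitboard_split(const unsigned char *p)
{
    uint64_t lo = load32_le(p);      /* movd mm5, [a]        : low dword, zero-extended */
    uint64_t hi = load32_le(p + 4);  /* punpckldq mm5, [a+4] : high dword on top        */
    return lo | (hi << 32);
}
```

Each 32-bit read only needs 4-byte alignment, which is why the two-load variant avoids the penalty of a misaligned 8-byte movq.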
>
>In an example a was represented by mm5 (probably the 6th memory space in MMX
>(mm0 - mm7 if I recall correctly). How do I know what is in mm5?
>
Yes, there are eight MMX registers, mm0-mm7. You have to initialize a register
via movd or movq. To load zero, one should use pxor with the same source and
destination register. To load 0xff..ff (-1), one should use pcmpeqd (packed
compare equal doubleword) with the same source and destination register:
// input: mm1 pawnBB
// output: mm0 pawnAttacks
//=========================
void getBlackPawnAttacksMMX()
{
__asm
{
pxor mm4, mm4 ; 0x0000000000000000
pcmpeqd mm7, mm7 ; 0xffffffffffffffff
movq mm0, mm1 ; left
psubb mm4, mm7 ; 0x0101010101010101
psubb mm7, mm4 ; 0xfefefefefefefefe notA
paddb mm1, mm1 ; right
pand mm0, mm7 ; clear left a-file before shift
psrlq mm0, 1 ; left
por mm0, mm1 ; left|right
psrlq mm0, 8 ; left|right -> down
}
}
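For checking the MMX routine against something easier to read, here is a scalar C sketch of the same computation. It assumes the mapping implied by the notA constant above (bit 0 = a1, so the a-file is bit 0 of every byte); the function name black_pawn_attacks is made up for this example. Plain C can write the masks as literals, so the pxor/pcmpeqd/psubb constant generation is not needed here.

```c
#include <stdint.h>

/* Scalar counterpart of getBlackPawnAttacksMMX, assuming bit 0 = a1. */
static uint64_t black_pawn_attacks(uint64_t pawns)
{
    const uint64_t notA = 0xfefefefefefefefeULL;

    /* pand + psrlq 1: drop a-file pawns, then move one file towards a. */
    uint64_t left = (pawns & notA) >> 1;

    /* paddb mm1, mm1 doubles each byte separately, so an h-file bit
       falls off its byte instead of wrapping into the next rank.
       A 64-bit shift followed by the notA mask has the same effect. */
    uint64_t right = (pawns << 1) & notA;

    /* por + psrlq 8: combine both directions and move one rank down. */
    return (left | right) >> 8;
}
```

With this mapping a black pawn on e7 (bit 52) attacks d6 and f6 (bits 43 and 45), while pawns on a7 or h7 attack only b6 or g6 respectively.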
Gerd
>Sander.