Computer Chess Club Archives


Subject: Re: Fast 3DNow! BitScan, one more faster

Author: Gerd Isenberg

Date: 04:30:36 12/02/02


A possible application in capture generation, with floodfill-based (MMX) attack
generation:

 ...
 BitBoard rooks = OwnRooks() & someBookkeepingMask;
 while (rooks)
 {
	BitBoard rook = rooks & -rooks; rooks ^= rook; // get and reset lsb
	BitBoard rookAttacks = getRookAttacks(rook);
	BitBoard attackedQueens = rookAttacks & EnemyQueens() & ...;
	while (attackedQueens)
	{
		BitBoard queen = attackedQueens & -attackedQueens;
		attackedQueens ^= queen; // reset lsb
		PushCapture (RookTakesQueen, getMoveIndex(rook,queen));
	}
	BitBoard attackedRooks = rookAttacks & EnemyRooks() & ...;
        ...
 }
 ...


This saves the bitscan entirely when the target sets are empty (there is not
always a winning rook capture). With Hammer it is possible to run the whole
nested generation loops in 64-bit general-purpose registers.
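
For readers who want to try the "get and reset lsb" idiom in isolation, here is
a minimal portable C sketch. The name splitBits and the output array are my own
assumptions for illustration, not part of the generator above. It only shows
that an empty set falls straight through the loop without any bit index being
computed:

```c
#include <stdint.h>

typedef uint64_t BitBoard;

/* Hypothetical helper: writes each set bit of 'set' to out[] as a
   single-bit board, lowest bit first, and returns how many were found. */
int splitBits(BitBoard set, BitBoard out[64])
{
	int n = 0;
	while (set)                          /* empty set: loop body never runs */
	{
		BitBoard bit = set & (0 - set);  /* isolate the least significant one bit */
		set ^= bit;                      /* and reset it */
		out[n++] = bit;                  /* still a bitboard, no index needed yet */
	}
	return n;
}
```

The single-bit boards stay boards until a move is actually pushed, which is
where the index conversion happens.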

The cost of scanning the from bitboard once per target is negligible, because
the from and to bitboards are processed in parallel in the following routine:


int getMoveIndex(BitBoard fromBB, BitBoard toBB)
{
	__asm
	{
		pxor	mm1, mm1	; 0
		pxor	mm3, mm3	; 0
		movd		mm0, [fromBB]
		punpckldq	mm0, [fromBB+4]
		movd		mm2, [toBB]
		punpckldq	mm2, [toBB+4]
		pcmpeqd	mm6, mm6	; -1
		pxor	mm7, mm7	; 0
		pcmpeqd	mm1, mm0	; ~mask of the nonzero dword
		PI2FD	mm0, mm0	; 3f8..,400..
		pcmpeqd	mm3, mm2	; ~mask of the nonzero dword
		PI2FD	mm2, mm2	; 3f8..,400..
		pxor	mm1, mm6	; mask of the nonzero dword
		pxor	mm3, mm6	; mask of the nonzero dword
		psrlq	mm6, 63		; 01
		psrld	mm0, 23		; 3f8 to 7f
		psrld	mm2, 23		; 3f8 to 7f
		psrld	mm1, 25		; 7f mask
		psrld	mm3, 25		; 7f mask
		psllq	mm6, 32+5	; 20:00
		psubd	mm0, mm1	; - 7f mask
		psubd	mm2, mm3	; - 7f mask
		por	mm0, mm6	; + 32 in high dword
		por	mm2, mm6	; + 32 in high dword
		pand	mm0, mm1	; & 7f mask
		pand	mm2, mm3	; & 7f mask
		psadbw	mm0, mm7	; add all bytes
		psadbw	mm2, mm7	; add all bytes
		psllq	mm0, 6		; from*64
		por	mm0, mm2	; from*64 + to
		movd	eax, mm0
	}
}
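
As a cross-check for the routine above, here is a rough portable C sketch of
what it is meant to return, from*64 + to, using the same float-exponent idea
but one dword at a time instead of in parallel. The names bitIndex32,
bitIndex64 and getMoveIndexPortable are my own, and the sketch assumes IEEE-754
single-precision floats and exactly one set bit in each bitboard. The
conversion here is unsigned, so the 7f masking that the MMX code needs after
the signed PI2FD is not required:

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t BitBoard;

/* Index 0..31 of the single set bit in a nonzero dword, read from the
   exponent field of its float representation (assumes IEEE-754 floats). */
static int bitIndex32(uint32_t oneBit)
{
	float f = (float)oneBit;             /* exact, the value is a power of two */
	uint32_t bits;
	memcpy(&bits, &f, sizeof bits);      /* reinterpret without aliasing trouble */
	return (int)((bits >> 23) & 0xff) - 127;  /* remove the exponent bias */
}

/* Index 0..63 of the single set bit; adds 32 if the bit is in the high dword. */
static int bitIndex64(BitBoard oneBit)
{
	uint32_t lo = (uint32_t)oneBit;
	uint32_t hi = (uint32_t)(oneBit >> 32);
	return lo ? bitIndex32(lo) : 32 + bitIndex32(hi);
}

/* Hypothetical reference version: from*64 + to. */
int getMoveIndexPortable(BitBoard fromBB, BitBoard toBB)
{
	return bitIndex64(fromBB) * 64 + bitIndex64(toBB);
}
```

For example, a move from square 12 to square 28 gives 12*64 + 28 = 796, which
is handy for comparing against the eax result of the assembly version.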

Cheers,
Gerd


