Author: Gerd Isenberg
Date: 03:44:56 05/22/05
Hi all,
Subtracting simd-wise is the fastest approach to get attack sets in one
horicontal direction from a set of rooks or queens:
occu ::= occupied set
rooks ::= rooks/queens subset of occu
rightAttacks ::= occu ^ (occu - 2*rooks)
or
rightAttacks ::= occu ^ ((occu & ~rooks) - rooks)
The first subtract "clears" the rook bits in occupied (rooks must be subset of
occupied!). The second subtract "generates" the rook attacks by propagating
borrow bits, while the final xor filters the changed bits - exactly all attacks
of all rooks for this direction. Four sse2 or mmx instructions:
movdqa xmm0, xmm2 ; rooks
pandn xmm0, xmm3 ; ~rooks & occu
psubb xmm0, xmm2 ; (~rooks & occu) - rooks
pxor xmm0, xmm3 ;((~rooks & occu) - rooks) ^ occu
I like to emulate the 64-bit subtract only with bitwise- and shift-instructions
- to deduce Kogge-Stone fill-routines to propagate sliding attacks rather than
carry bits.
As Steffan Westcott already mentioned while introducing
parallel prefix Kogge Stone routines for sliding attack generation:
http://chessprogramming.org/cccsearch/ccc.php?art_id=261956
"The other method is inspired by parallel prefix circuits, a hardware design for
fast carry propagation in adders. The Kogge-Stone method lends itself to
convenient implementation in software, here used to propagate sliding chess
pieces rather than carry bits."
--------------------------
Add and carry propagation is done by "xor" and "and". The carries are shifted in
the desired direction and then iterative added again:
int add (int x, int y) {
p = x ^ y; // sum by xor
g = x & y; // carry by and
while ( g ) { // more carries?
g <<= 1; // yes, shift carries left
x = p ^ g; // sum by xor
g = p & g; // carry by and
p = x;
}
return p;
}
to compute the all carries in parallel prefix manner:
g = x & y; // overflows
p = x ^ y; // sums
c[i+i] = g[i] | p[i]&c[i]
c1 = g0
c2 = g1 | c1&p1 = g1 | g0&p1
c3 = g2 | c2&p2 = g2 | g1&p2 | g0&p2&p1
c4 = g3 | c3&p3 = g3 | g2&p3 | g1&p3&p2 | g0&p3&p2&p1
...
The routine for bytewise parallel add looks allready similar to Steffan
Westcott's Kogge-Stone fillroutines:
BitBoard paddb (BitBoard x, BitBoard y) {
BitBoard g, p, su;
g = ( x & y);
p = ( x ^ y) & 0xfefefefefefefefe;
g |= (g << 1) & p;
p = (p << 1) & p;
g |= (g << 2) & p;
p = (p << 2) & p;
g |= (g << 4) & p; // the final carries
su =((g << 1) & 0xfefefefefefefefe) ^ (x ^ y);
return su;
}
I thought about a parallel sub without using negate of course.
Since negate is twos-complement and requires ones-complement and
a further add one. Is that really so simple - only complement x while
initializing g and p?
BitBoard psubb(BitBoard x, BitBoard y) {
BitBoard g, p, df;
g = (~x & y);
p = (~x ^ y) & 0xfefefefefefefefe;
g |= (g << 1) & p;
p = (p << 1) & p;
g |= (g << 2) & p;
p = (p << 2) & p;
g |= (g << 4) & p;
df =((g << 1) & 0xfefefefefefefefe) ^ (x ^ y);
return df; // x - y
}
After some substitutions the final fill-routine - q.e.d.
BitBoard rightAttacks(BitBoard free, BitBoard rooks)
{
BitBoard g, p, df;
g = rooks;
p = free & 0xfefefefefefefefe; // notA
g |= (g << 1) & p;
p = (p << 1) & p;
g |= (g << 2) & p;
p = (p << 2) & p;
g |= (g << 4) & p;
df =((g << 1) & 0xfefefefefefefefe); // notA
return df;
}
In opposition to carry propagation all eight fill directions
of sliding pieces may be propagated.
Cheers,
Gerd
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.