Author: Gerd Isenberg
Date: 12:05:25 09/21/04
<snip>
>So the modified method posted by Russell is much better.
>
>BitBoard rightPawnAttacks(BitBoard pawns, int color)
>{
>    return ((pawns<<9)>>(color*16)) & notA;
>}
>
>BitBoard leftPawnAttacks(BitBoard pawns, int color)
>{
>    return ((pawns<<7)>>(color*16)) & notH;
>}
>
>
>Since msc generates a call for "variable" 64-bit shift,
>this inline assembly is probably the fastest to get pawn attacks with color
>param.
OK, I measured performance with rdtsc and compared it against the conditional
routine below. I guess that color follows a pattern that modern CPUs predict
well. In the correctly predicted case, the conditional routine takes 4 cycles
(all measurements on AMD64 in 32-bit mode). In the mispredicted case it seems
to take 20 cycles.
BitBoard rightPawnAttacks(BitBoard pawns, int color)
{
    if ( color )
        return (pawns>>7) & 0xfefefefefefefefe; // black pawns
    return (pawns<<9) & 0xfefefefefefefefe;     // white pawns
}
The branchless asm routine takes about 9 cycles. Both figures were measured
with the routine inlined and include the store of the result bitboard.
Conclusion:
If your conditions are not really random, branches are not that bad and are
most often faster than branchless approaches.
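For anyone who wants to check the two variants against each other, here is a minimal portable C sketch. The names rightPawnAttacksBranch, rightPawnAttacksShift and NOT_A_FILE are made up for illustration, and I assume bit 0 = a1 with the a-file in the low bit of each byte, which is what the 0xfefe... mask implies. It says nothing about timing, only about the results.

```c
#include <stdint.h>

typedef uint64_t BitBoard;

/* Hypothetical name: every square except the a-file (same value as the mask above). */
#define NOT_A_FILE 0xfefefefefefefefeULL

/* Conditional version: color 0 = white, 1 = black. */
static BitBoard rightPawnAttacksBranch(BitBoard pawns, int color)
{
    if (color)
        return (pawns >> 7) & NOT_A_FILE;  /* black: one rank down, one file right */
    return (pawns << 9) & NOT_A_FILE;      /* white: one rank up, one file right */
}

/* Branchless version: shift left by 9, then right by 0 (white) or 16 (black). */
static BitBoard rightPawnAttacksShift(BitBoard pawns, int color)
{
    return ((pawns << 9) >> (color * 16)) & NOT_A_FILE;
}
```

The two agree as long as no pawn sits on the eighth rank (bits 56 to 63), because the left shift in the branchless form drops those bits before shifting back. For real pawn sets that should not matter.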
Gerd
>
BitBoard rightPawnAttacks(BitBoard pawns, int color)
{
    __asm
    {
        mov  eax,dword ptr [pawns]
        mov  edx,dword ptr [pawns+4] ; bb in edx:eax
        mov  ecx,[color]             ; white=0, black=1
        shld edx,eax,9
        shl  ecx,4                   ; white=0, black=16
        shl  eax,9                   ; bb << 9
        shrd eax,edx,cl
        shr  edx,cl                  ; bb >> white? 0 : 16
        and  eax,0FEFEFEFEh          ; & notA low board
        and  edx,0FEFEFEFEh          ; & notA high board
        ; bitboard return via edx:eax
    }
}
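To make the register shuffling easier to follow, here is a rough C model of the asm routine that works on two 32-bit halves the way edx:eax does. It is only a sketch for reading along, not what the compiler emits; the name rightPawnAttacksHalves is made up, and the test on n exists only because C leaves a 32-bit shift by 32 undefined (the asm itself has no branch there).

```c
#include <stdint.h>

typedef uint64_t BitBoard;

/* Hypothetical C model of the inline-asm routine on 32-bit halves. */
static BitBoard rightPawnAttacksHalves(BitBoard pawns, int color)
{
    uint32_t lo = (uint32_t)pawns;          /* eax: low board  */
    uint32_t hi = (uint32_t)(pawns >> 32);  /* edx: high board */
    unsigned n  = (unsigned)color << 4;     /* ecx: white = 0, black = 16 */

    /* shld edx,eax,9 / shl eax,9: 64-bit left shift by 9 */
    hi = (hi << 9) | (lo >> 23);
    lo <<= 9;

    /* shrd eax,edx,cl / shr edx,cl: 64-bit right shift by n */
    if (n != 0) {                           /* C-only guard, see note above */
        lo = (lo >> n) | (hi << (32 - n));
        hi >>= n;
    }

    lo &= 0xFEFEFEFEu;                      /* & notA, low board  */
    hi &= 0xFEFEFEFEu;                      /* & notA, high board */

    return ((BitBoard)hi << 32) | lo;       /* result as edx:eax */
}
```

The shld/shrd pair carries the bits that cross the 32-bit boundary, which is why white pawns on the fourth rank still end up with attacks in the high board.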