Author: Gerd Isenberg
Date: 12:43:55 04/08/04
On April 08, 2004 at 14:46:04, Dann Corbit wrote:
<snip>
>>>How many instructions to perform the state packing?
>>>
>>>mask = (1 << bitIndex2remove) - 1; // bitIndex2remove 0..7
>>>state = eightBitState & mask;
>>>packed = (eightBitState >> 1) & ~mask; // unsigned shift msb is zero
>>>packedState = packed|state;
>>>
>>>Six/seven instructions.
>>
>>Ok, or one memory access ;-)
>>
>>packedState = somePrecalc[eightBitState][bitIndex2remove];
>
>Yes. Also, you were concerned about the transition to move lists already.
>But I also have the bitmaps returned, so I have that data too.
I see - sometimes redundancy pays off ;-)
But if you look for sets of properties, like giving check, (SEE) safe moves,
putting pieces en prise, attacking moves etc., as I do with bitwise operations,
and you like to assign some scores to the moves based on those sets...
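
For reference, the state packing quoted further up can be written out as a small C++ sketch, together with the table lookup alternative. The names packState and makePackTable are made up for this illustration, and the table layout is only my assumption of what somePrecalc might look like.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Remove bit `bitIndex2remove` (0..7) from an 8-bit state and close the gap,
// following the four quoted statements. The result fits in 7 bits.
inline std::uint8_t packState(std::uint8_t eightBitState, unsigned bitIndex2remove) {
    assert(bitIndex2remove < 8);
    const unsigned mask   = (1u << bitIndex2remove) - 1u;  // bits below the removed bit
    const unsigned state  = eightBitState & mask;          // low part stays in place
    const unsigned packed = (eightBitState >> 1) & ~mask;  // high part moves down by one
    return static_cast<std::uint8_t>(packed | state);
}

// Hypothetical builder for a somePrecalc[eightBitState][bitIndex2remove] table
// (256 * 8 bytes = 2 KiB), so the packing becomes one memory access.
inline std::array<std::array<std::uint8_t, 8>, 256> makePackTable() {
    std::array<std::array<std::uint8_t, 8>, 256> table{};
    for (unsigned s = 0; s < 256; ++s)
        for (unsigned i = 0; i < 8; ++i)
            table[s][i] = packState(static_cast<std::uint8_t>(s), i);
    return table;
}
```

For example, removing bit 2 from 0xB5 (10110101) gives 0x59 (1011001). Whether the computed or the table version is faster probably depends on cache pressure in the surrounding code.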
>
>The bitmaps are based on all the pieces on the board and are the attacked
>squares (including empty ones, ones with enemy pieces and ones with my pieces)
>and also shadow bitmaps (which are squares attacked up to and including the next
>piece on the ray behind the first attacked one -- if any).
>
>>>
>>>What about that one:
>>>Combination of rotated and reversed bitboards (4*2 == 8 occupied boards to
>>>update) and to use some bytewise simd instructions performing the xor minus two
>>>trick, with 128-bit registers simultaneously for white and black:
>>>
>>>// xmm3 occupied:occupied
>>>// xmm1 brooks:wrooks
>>>movq xmm7, xmm3 ; occupied
>>
>>Oops, movdqa for xmm registers
>>
>>movdqa xmm7, xmm3 ; occupied
>>psubb xmm3, xmm1 ; occupied - rooks
>>psubb xmm3, xmm1 ; occupied - 2*rooks
>>pxor xmm3, xmm7 ; rightattacks := occupied ^ (occupied - 2*rooks)
>
>I don't use any assembly at all. I compile and link on all kinds of crazy
>machines including Tru64 Alpha and IBM mainframe. Assembly would just get in
>the way.
In the future I will use 128-bit wrapper classes with overloaded operator
functions for that. A (conditionally compiled) class template parameter
determines the register representation of a routine (general purpose registers,
XMM registers for P4 and AMD64, or AltiVec registers on G5):
template <class T>
void xxxAttacks(sTarget* pTarget, const sSource* pSource)
{
    T occupied(&pSource->occup);
    T rooks(&pSource->rooks);
    T rightAttacks ( occupied ^ (occupied - rooks - rooks) );
    rightAttacks.store(&pTarget->right);
    ...
}
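
To make the idea concrete, here is a minimal sketch of how such a template could be instantiated with a plain, portable representation. Everything except the member names occup, rooks and right is an assumption for illustration: Board128, the layouts of sSource and sTarget, the class ByteVec16 and the function name rightAttacks are hypothetical. An SSE2 or AltiVec class with the same interface could presumably be swapped in as T.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 128-bit storage: two 64-bit boards (e.g. white and black) as 16 bytes.
struct Board128 { std::uint8_t b[16]; };

// Hypothetical source/target layouts; only the member names come from the post.
struct sSource { Board128 occup; Board128 rooks; };
struct sTarget { Board128 right; };

// Portable reference representation: byte by byte, no SIMD.
// operator- mimics psubb (wrapping bytewise subtract), operator^ mimics pxor.
class ByteVec16 {
public:
    explicit ByteVec16(const Board128* p) { std::memcpy(v_, p->b, 16); }
    void store(Board128* p) const { std::memcpy(p->b, v_, 16); }

    friend ByteVec16 operator-(ByteVec16 a, const ByteVec16& c) {
        for (int i = 0; i < 16; ++i)
            a.v_[i] = static_cast<std::uint8_t>(a.v_[i] - c.v_[i]);
        return a;
    }
    friend ByteVec16 operator^(ByteVec16 a, const ByteVec16& c) {
        for (int i = 0; i < 16; ++i)
            a.v_[i] = static_cast<std::uint8_t>(a.v_[i] ^ c.v_[i]);
        return a;
    }

private:
    std::uint8_t v_[16];
};

// Same shape as the routine above, reduced to the one direction shown there.
template <class T>
void rightAttacks(sTarget* pTarget, const sSource* pSource) {
    T occupied(&pSource->occup);
    T rooks(&pSource->rooks);
    T right(occupied ^ (occupied - rooks - rooks));
    right.store(&pTarget->right);
}
```

A call would look like rightAttacks<ByteVec16>(&target, &source). With a rank byte of 0x92 occupied and a rook on bit 1 (0x02), the result byte is 0x1C, i.e. the squares up to and including the first blocker.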
Anyway, implementing bytewise sub (16 times) with general purpose registers is
not as efficient as with bytewise SIMD instructions. Two subs with alternately
masked bytes...
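
One possible reading of "two subs with alternately masked bytes", sketched for a single 64-bit general purpose register (a 128-bit value would need this twice). This is an assumption about the intended technique, not a tested engine routine; the names subBytes and rightAttacks are made up for the illustration.

```cpp
#include <cstdint>

// Bytewise wrapping subtraction of eight packed bytes (x - y per byte) in one
// 64-bit register, using two subtractions on alternately masked bytes.
inline std::uint64_t subBytes(std::uint64_t x, std::uint64_t y) {
    const std::uint64_t lowBytes = 0x00FF00FF00FF00FFULL; // low byte of each 16-bit lane
    const std::uint64_t guard    = 0x0100010001000100ULL; // bit 8 of each lane absorbs the borrow

    // Even bytes: the guard bit keeps every lane >= its subtrahend,
    // so no borrow can cross into the neighbouring lane.
    const std::uint64_t even =
        (((x & lowBytes) | guard) - (y & lowBytes)) & lowBytes;
    // Odd bytes: shift them down into the even positions, subtract the same way.
    const std::uint64_t odd =
        ((((x >> 8) & lowBytes) | guard) - ((y >> 8) & lowBytes)) & lowBytes;
    return even | (odd << 8);
}

// occupied ^ (occupied - 2*rooks), evaluated independently in each byte (rank).
inline std::uint64_t rightAttacks(std::uint64_t occupied, std::uint64_t rooks) {
    return occupied ^ subBytes(subBytes(occupied, rooks), rooks);
}
```

Each bytewise sub costs two real subtractions plus several masks and shifts here, against a single psubb in the SIMD version, which is the efficiency gap meant above.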