Author: Stuart Cracraft
Date: 14:47:21 12/22/05
Go up one level in this thread
On December 22, 2005 at 16:29:40, Gerd Isenberg wrote:
>>Is this faster than Bob's bit routine in Crafty?
>>
>>Anyone compared?
>
>I guess not - it might be a nice portable backup.
>
>Iirc Crafty's FirstOne is based on leading zero count, so this bsr replacement
>is used very often with 63-bitScanReverse or so. It should be inlined for speed
>issues - but codesize will grow. Since the bitScan-intrinsic produces only one
>64-bit instruction - bsr reg64,reg64 which are only a 4 bytes instead of 100 or
>so for the C-routine. The codesize will explode a bit - if used often - which is
>likely in Crafty.
>
>Otoh the bsr reg64,reg64 is 9 cycles vector path. The let say 24 cycles of the
>C-routine may be scheduled with other branchless but more memory transfer
>instructions around - to improve ipc of the otherwise very dependent instruction
>chain with about only one instruction per cycle.
>
>If the routine is used rarely as in my case, (i traverse the boards the other
>way around) - it might be usefull.
>
>Out of curiosity - in my 32-bit program the routine with takes the largest
>number of 64 cycles, behaves best in my program - negligible and may be noise -
>however:
>
>// 64 cycles
>unsigned int bitScanReverse(BitBoard bb) {
> union {
> double d;
> struct {
> unsigned int mantissal : 32;
> unsigned int mantissah : 20;
> unsigned int exponent : 11;
> unsigned int sign : 1;
> };
> } ud;
> ud.d = (double)(bb & ~(bb >> 32));
> return ud.exponent - 1023;
>}
>
>the worst one:
>
>// 30 cycles - implies parameter via memory even if inlined
>unsigned int bitScanReverse(BitBoard bb) {
> __asm {
> bsr eax,[bb]
> bsr eax,[bb+4]
> setnz dl
> shl dl, 5
> add al, dl
> }
>}
>
>Gerd
Interesting. My program uses this borrowed from the great Crafty
years ago, probably not the same as more recent FirstOne():
#ifdef MS_COMPILER
typedef unsigned __int64 BitBoard;
#else
typedef unsigned long long BitBoard;
#endif
#ifdef BITS
unsigned char lzArray[65536];
unsigned char BitCount[65536];
#ifdef MS_COMPILER
__forceinline static unsigned char leadz (BitBoard b)
#else
inline static unsigned char leadz (BitBoard b)
#endif
/**************************************************************************
*
* Returns the leading bit in a bitboard. Leftmost bit is 0 and
* rightmost bit is 63. Thanks to Robert Hyatt for this algorithm.
*
***************************************************************************/
{
if (b >> 48) return lzArray[b >> 48];
if (b >> 32) return lzArray[b >> 32] + 16;
if (b >> 16) return lzArray[b >> 16] + 32;
return lzArray[b] + 48;
}
#define NBITS 16
void initlzarray (void)
/***************************************************************************
*
* The lzArray is created. This array is used when the position
* of the leading non-zero bit is required. The convention used
* is that the leftmost bit is considered as position 0 and the
* rightmost bit position 63.
*
***************************************************************************/
{
int i, j, s, n;
s = n = 1;
for (i = 0; i < NBITS; i++)
{
for (j = s; j < s + n; j++)
lzArray[j] = NBITS - 1 - i;
s += n;
n += n;
}
}
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.