Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: just another reverse bitscan

Author: Stuart Cracraft

Date: 14:47:21 12/22/05

Go up one level in this thread


On December 22, 2005 at 16:29:40, Gerd Isenberg wrote:

>>Is this faster than Bob's bit routine in Crafty?
>>
>>Anyone compared?
>
>I guess not - it might be a nice portable backup.
>
>Iirc Crafty's FirstOne is based on leading zero count, so this bsr replacement
>is used very often with 63-bitScanReverse or so. It should be inlined for speed
>issues - but codesize will grow. Since the bitScan-intrinsic produces only one
>64-bit instruction - bsr reg64,reg64 which are only a 4 bytes instead of 100 or
>so for the C-routine. The codesize will explode a bit - if used often - which is
>likely in Crafty.
>
>Otoh the bsr reg64,reg64 is 9 cycles vector path. The let say 24 cycles of the
>C-routine may be scheduled with other branchless but more memory transfer
>instructions around - to improve ipc of the otherwise very dependent instruction
>chain with about only one instruction per cycle.
>
>If the routine is used rarely as in my case, (i traverse the boards the other
>way around) - it might be usefull.
>
>Out of curiosity - in my 32-bit program the routine with takes the largest
>number of 64 cycles, behaves best in my program - negligible and may be noise -
>however:
>
>// 64 cycles
>unsigned int bitScanReverse(BitBoard bb) {
>   union {
>      double d;
>      struct {
>         unsigned int mantissal : 32;
>         unsigned int mantissah : 20;
>         unsigned int exponent : 11;
>         unsigned int sign : 1;
>      };
>   } ud;
>   ud.d = (double)(bb & ~(bb >> 32));
>   return ud.exponent - 1023;
>}
>
>the worst one:
>
>// 30 cycles - implies parameter via memory even if inlined
>unsigned int bitScanReverse(BitBoard bb) {
>  __asm  {
>    bsr eax,[bb]
>    bsr eax,[bb+4]
>    setnz dl
>    shl	dl, 5
>    add al, dl
>  }
>}
>
>Gerd

Interesting. My program uses this borrowed from the great Crafty
years ago, probably not the same as more recent FirstOne():

#ifdef MS_COMPILER
typedef unsigned __int64 BitBoard;
#else
typedef unsigned long long BitBoard;
#endif

#ifdef BITS
unsigned char lzArray[65536];
unsigned char BitCount[65536];
#ifdef MS_COMPILER
__forceinline static unsigned char leadz (BitBoard b)
#else
inline static unsigned char leadz (BitBoard b)
#endif
/**************************************************************************
*
*  Returns the leading bit in a bitboard.  Leftmost bit is 0 and
*  rightmost bit is 63.  Thanks to Robert Hyatt for this algorithm.
*
***************************************************************************/
{
  if (b >> 48) return lzArray[b >> 48];
  if (b >> 32) return lzArray[b >> 32] + 16;
  if (b >> 16) return lzArray[b >> 16] + 32;
  return lzArray[b] + 48;
}

#define NBITS 16
void initlzarray (void)
/***************************************************************************
*
*  The lzArray is created.  This array is used when the position
*  of the leading non-zero bit is required.  The convention used
*  is that the leftmost bit is considered as position 0 and the
*  rightmost bit position 63.
*
***************************************************************************/
{
  int i, j, s, n;

  s = n = 1;
  for (i = 0; i < NBITS; i++)
  {
    for (j = s; j < s + n; j++)
      lzArray[j] = NBITS - 1 - i;
    s += n;
    n += n;
  }
}



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.