Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: just another reverse bitscan

Author: Gerd Isenberg

Date: 13:29:40 12/22/05

Go up one level in this thread


>Is this faster than Bob's bit routine in Crafty?
>
>Anyone compared?

I guess not - it might be a nice portable backup.

Iirc Crafty's FirstOne is based on leading zero count, so this bsr replacement
is used very often with 63-bitScanReverse or so. It should be inlined for speed
issues - but codesize will grow. Since the bitScan-intrinsic produces only one
64-bit instruction - bsr reg64,reg64 which are only a 4 bytes instead of 100 or
so for the C-routine. The codesize will explode a bit - if used often - which is
likely in Crafty.

Otoh the bsr reg64,reg64 is 9 cycles vector path. The let say 24 cycles of the
C-routine may be scheduled with other branchless but more memory transfer
instructions around - to improve ipc of the otherwise very dependent instruction
chain with about only one instruction per cycle.

If the routine is used rarely as in my case, (i traverse the boards the other
way around) - it might be usefull.

Out of curiosity - in my 32-bit program the routine with takes the largest
number of 64 cycles, behaves best in my program - negligible and may be noise -
however:

// 64 cycles
unsigned int bitScanReverse(BitBoard bb) {
   union {
      double d;
      struct {
         unsigned int mantissal : 32;
         unsigned int mantissah : 20;
         unsigned int exponent : 11;
         unsigned int sign : 1;
      };
   } ud;
   ud.d = (double)(bb & ~(bb >> 32));
   return ud.exponent - 1023;
}

the worst one:

// 30 cycles - implies parameter via memory even if inlined
unsigned int bitScanReverse(BitBoard bb) {
  __asm  {
    bsr eax,[bb]
    bsr eax,[bb+4]
    setnz dl
    shl	dl, 5
    add al, dl
  }
}

Gerd



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.