Author: Stuart Cracraft
Date: 14:47:21 12/22/05
Go up one level in this thread
On December 22, 2005 at 16:29:40, Gerd Isenberg wrote: >>Is this faster than Bob's bit routine in Crafty? >> >>Anyone compared? > >I guess not - it might be a nice portable backup. > >Iirc Crafty's FirstOne is based on leading zero count, so this bsr replacement >is used very often with 63-bitScanReverse or so. It should be inlined for speed >issues - but codesize will grow. Since the bitScan-intrinsic produces only one >64-bit instruction - bsr reg64,reg64 which are only a 4 bytes instead of 100 or >so for the C-routine. The codesize will explode a bit - if used often - which is >likely in Crafty. > >Otoh the bsr reg64,reg64 is 9 cycles vector path. The let say 24 cycles of the >C-routine may be scheduled with other branchless but more memory transfer >instructions around - to improve ipc of the otherwise very dependent instruction >chain with about only one instruction per cycle. > >If the routine is used rarely as in my case, (i traverse the boards the other >way around) - it might be usefull. > >Out of curiosity - in my 32-bit program the routine with takes the largest >number of 64 cycles, behaves best in my program - negligible and may be noise - >however: > >// 64 cycles >unsigned int bitScanReverse(BitBoard bb) { > union { > double d; > struct { > unsigned int mantissal : 32; > unsigned int mantissah : 20; > unsigned int exponent : 11; > unsigned int sign : 1; > }; > } ud; > ud.d = (double)(bb & ~(bb >> 32)); > return ud.exponent - 1023; >} > >the worst one: > >// 30 cycles - implies parameter via memory even if inlined >unsigned int bitScanReverse(BitBoard bb) { > __asm { > bsr eax,[bb] > bsr eax,[bb+4] > setnz dl > shl dl, 5 > add al, dl > } >} > >Gerd Interesting. My program uses this borrowed from the great Crafty years ago, probably not the same as more recent FirstOne(): #ifdef MS_COMPILER typedef unsigned __int64 BitBoard; #else typedef unsigned long long BitBoard; #endif #ifdef BITS unsigned char lzArray[65536]; unsigned char BitCount[65536]; #ifdef MS_COMPILER __forceinline static unsigned char leadz (BitBoard b) #else inline static unsigned char leadz (BitBoard b) #endif /************************************************************************** * * Returns the leading bit in a bitboard. Leftmost bit is 0 and * rightmost bit is 63. Thanks to Robert Hyatt for this algorithm. * ***************************************************************************/ { if (b >> 48) return lzArray[b >> 48]; if (b >> 32) return lzArray[b >> 32] + 16; if (b >> 16) return lzArray[b >> 16] + 32; return lzArray[b] + 48; } #define NBITS 16 void initlzarray (void) /*************************************************************************** * * The lzArray is created. This array is used when the position * of the leading non-zero bit is required. The convention used * is that the leftmost bit is considered as position 0 and the * rightmost bit position 63. * ***************************************************************************/ { int i, j, s, n; s = n = 1; for (i = 0; i < NBITS; i++) { for (j = s; j < s + n; j++) lzArray[j] = NBITS - 1 - i; s += n; n += n; } }
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.