Computer Chess Club Archives


Search

Terms

Messages

Subject: just another reverse bitscan

Author: Gerd Isenberg

Date: 11:25:24 12/22/05


Hi,

for all bitScan-collectors - just another one ;-)

It is not particular small, but portable, branchless, has cheap instructions and
no memory lookup. It seems faster than those based on double conversion (inlined
24 versus 40 amd64 cycles (32-bit mode, measured by rdtsc).

unsigned int bitScanReverse(BitBoard bb)
{
  unsigned int l, h, i, m;
  h   = (unsigned int)(bb >> 32);
  l   = (unsigned int) bb;
  m   = h != 0;
  i   = m << 5;             // 0|32
  l   = (h & -m) | (l & m-1);
  m   = (l > 0xffff) << 4;  // 0|16
  i  += m;
  l >>= m;
  m   = ((0xff-l)>>16) & 8; // 0|8
  i  += m;
  l >>= m;
  m   = ((0x0f-l)>> 8) & 4; // 0|4
  l >>= m;
  return i + m + ((0xffffaa50u >> (2*l)) & 3);
}

It is based on the conditional "divide and conquer" one, posted here by Eugene
Nalimov. The folding is implemented branchless, eg. the word/byte folding:

if (l & 0xff00) {
  i += 8;
  l >>= 8;
}

m   = (l > 0xff) << 3;
i  += m;
l >>= m;

or

m   = ((0xff-l)>>16) & 8;
i  += m;
l >>= m;

While Eugene used a final lookup[foldedByte]
or with only 16-bit folding lookup[foldedWord],
here the final pre-lookup step is folding from byte to nibble for an
in-register-lookup with 2bit[16], as suggested by Paul Womack in comp.arch on
May 15, 2000.

Gerd



This page took 0.02 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.