Author: Anthony Cozzie
Date: 09:15:30 10/17/03
Go up one level in this thread
On October 17, 2003 at 09:54:34, Vladimir Medvedev wrote:
>>I fear your switch case with single isolated LS1B is not optimizable as a jump
>>table, but works like 64 nested "if then elses". Not sure about how the branch
>>prediction works here, but i guess the squares on the "black" side are a bit
>>handicapped ;-)
>>
>>I suggest a lookup-approach (similar to your popcount), Walter Faxon's genius
>>folding trick, Matt Taylor's de Bruijn approach or several assembler routines
>>using 32-bit bsf, all already posted here in CCC x times.
>
>When I switched to bitboards from 0x88 representation I run some tests (with
>Win32 version), trying to use lookup table for LSB too. The results were (for
>set of random 64-bit numbers generated by my Randon64):
>
>1. lookup is faster than switch() for PopCount
>2. switch() is faster than lookup for LSB
>
>But I can't remember exact digits now...
That is because your test is flawed ;)
Bitboards in chess are usually sparsely populated (3-4 ones, a bunch of zeroes).
In a random number, there is a 50% chance that the first bit is 1, a 75% chance
that the first 1-2 bits are 1, etc. In other words, your tree was OK because
you never had to traverse it.
I liked the program though - very readable source, good clean OO design. How
strong is it, and how many nodes/sec does it search?
anthony
P.S. This Zappa's bitscan code - stolen from Walter Faxon & Matt Taylor. Taylor
uses a multiply while faxon uses several shifts and adds. I think the multiply
is slower on P4 but faster on Athlon.
---------------------------------------------------------------------------
//Table reports number of low-order bit as 0, high-order as 63.
//Faxon style constants for constant folding
const zappa_u8 LSB_64_table[154] =
{23,0,0,0,31,0,0,39,19,0,17,16,18,0,47,10,20,9,8,11,
1,0,2,57,56,58,3,12,0,59,0,0,21,0,4,0,0,60,0,0,0,0,
0,13,0,0,0,0,0,0,5,0,0,61,0,0,0,0,0,0,0,0,0,0,22,0,
0,0,30,0,0,38,0,0,0,14,0,0,46,0,0,0,6,0,0,62,0,0,0,
54,0,0,0,0,0,0,0,0,0,0,29,0,0,37,0,0,0,0,0,0,45,0,
0,0,0,0,28,0,0,36,0,53,0,0,27,0,44,35,26,24,25,34,
32,33,43,40,41,52,42,15,0,50,48,49,0,51,7,0,0,63,0,
0,0,55};
#define LSB_64_adj -51 // offset to table base
#define LSB_64_magic ((zappa_u32)0x01C5FC81) // magic constant
//Taylor style constants for constant folding (of course, leading zero count and
least significant bit are the same)
const zappa_u8 LZC_64_table[64] = { 0, 31, 4, 33, 60, 15, 12, 34, 61, 25, 51,
10, 56, 20, 22, 35,
62, 30, 3, 54, 52, 24, 42, 19, 57, 29, 2, 44, 47, 28, 1, 36,
63, 32, 59, 5, 6, 50, 55, 7, 16, 53, 13, 41, 8, 43, 46, 17,
26, 58, 49, 14, 11, 40, 9, 45, 21, 48, 39, 23, 18, 38, 37, 27};
#define LZC_64_magic ((zappa_u32)2015959759)
//#define LSB_TAYLOR_STYLE
#define LSB_FAXON_STYLE
//Return the location of the first one in a 1-bit. Uses
//De Bruijn polynomials for constant folding + hash function + table lookup.
//Options are the Matt Taylor multiply fold or the Walter Faxon xor/shift/add
fold.
int lead_one(zappa_u64 bb)
{
zappa_u64 t64;
zappa_u32 t32;
t64 = bb - 1;
bb &= t64;
t64 ^= bb;
t32 = (zappa_u32)t64 ^ (zappa_u32)(t64 >> 32);
#ifdef LSB_TAYLOR_STYLE
t32 *= LZC_64_magic;
return LZC_64_table[(t32 >> 26) & 63];
#endif
#ifdef LSB_FAXON_STYLE
t32 ^= LSB_64_magic;
t32 += t32 >> 16;
t32 -= t32 >> 8;
return (int)LSB_64_table [LSB_64_adj + (zappa_u8)t32];
#endif
}
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.