Author: Anthony Cozzie
Date: 09:15:30 10/17/03
Go up one level in this thread
On October 17, 2003 at 09:54:34, Vladimir Medvedev wrote: >>I fear your switch case with single isolated LS1B is not optimizable as a jump >>table, but works like 64 nested "if then elses". Not sure about how the branch >>prediction works here, but i guess the squares on the "black" side are a bit >>handicapped ;-) >> >>I suggest a lookup-approach (similar to your popcount), Walter Faxon's genius >>folding trick, Matt Taylor's de Bruijn approach or several assembler routines >>using 32-bit bsf, all already posted here in CCC x times. > >When I switched to bitboards from 0x88 representation I run some tests (with >Win32 version), trying to use lookup table for LSB too. The results were (for >set of random 64-bit numbers generated by my Randon64): > >1. lookup is faster than switch() for PopCount >2. switch() is faster than lookup for LSB > >But I can't remember exact digits now... That is because your test is flawed ;) Bitboards in chess are usually sparsely populated (3-4 ones, a bunch of zeroes). In a random number, there is a 50% chance that the first bit is 1, a 75% chance that the first 1-2 bits are 1, etc. In other words, your tree was OK because you never had to traverse it. I liked the program though - very readable source, good clean OO design. How strong is it, and how many nodes/sec does it search? anthony P.S. This Zappa's bitscan code - stolen from Walter Faxon & Matt Taylor. Taylor uses a multiply while faxon uses several shifts and adds. I think the multiply is slower on P4 but faster on Athlon. --------------------------------------------------------------------------- //Table reports number of low-order bit as 0, high-order as 63. //Faxon style constants for constant folding const zappa_u8 LSB_64_table[154] = {23,0,0,0,31,0,0,39,19,0,17,16,18,0,47,10,20,9,8,11, 1,0,2,57,56,58,3,12,0,59,0,0,21,0,4,0,0,60,0,0,0,0, 0,13,0,0,0,0,0,0,5,0,0,61,0,0,0,0,0,0,0,0,0,0,22,0, 0,0,30,0,0,38,0,0,0,14,0,0,46,0,0,0,6,0,0,62,0,0,0, 54,0,0,0,0,0,0,0,0,0,0,29,0,0,37,0,0,0,0,0,0,45,0, 0,0,0,0,28,0,0,36,0,53,0,0,27,0,44,35,26,24,25,34, 32,33,43,40,41,52,42,15,0,50,48,49,0,51,7,0,0,63,0, 0,0,55}; #define LSB_64_adj -51 // offset to table base #define LSB_64_magic ((zappa_u32)0x01C5FC81) // magic constant //Taylor style constants for constant folding (of course, leading zero count and least significant bit are the same) const zappa_u8 LZC_64_table[64] = { 0, 31, 4, 33, 60, 15, 12, 34, 61, 25, 51, 10, 56, 20, 22, 35, 62, 30, 3, 54, 52, 24, 42, 19, 57, 29, 2, 44, 47, 28, 1, 36, 63, 32, 59, 5, 6, 50, 55, 7, 16, 53, 13, 41, 8, 43, 46, 17, 26, 58, 49, 14, 11, 40, 9, 45, 21, 48, 39, 23, 18, 38, 37, 27}; #define LZC_64_magic ((zappa_u32)2015959759) //#define LSB_TAYLOR_STYLE #define LSB_FAXON_STYLE //Return the location of the first one in a 1-bit. Uses //De Bruijn polynomials for constant folding + hash function + table lookup. //Options are the Matt Taylor multiply fold or the Walter Faxon xor/shift/add fold. int lead_one(zappa_u64 bb) { zappa_u64 t64; zappa_u32 t32; t64 = bb - 1; bb &= t64; t64 ^= bb; t32 = (zappa_u32)t64 ^ (zappa_u32)(t64 >> 32); #ifdef LSB_TAYLOR_STYLE t32 *= LZC_64_magic; return LZC_64_table[(t32 >> 26) & 63]; #endif #ifdef LSB_FAXON_STYLE t32 ^= LSB_64_magic; t32 += t32 >> 16; t32 -= t32 >> 8; return (int)LSB_64_table [LSB_64_adj + (zappa_u8)t32]; #endif }
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.