Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: GreKo for Linux and FreeBSD

Author: Anthony Cozzie

Date: 09:15:30 10/17/03

Go up one level in this thread


On October 17, 2003 at 09:54:34, Vladimir Medvedev wrote:

>>I fear your switch case with single isolated LS1B is not optimizable as a jump
>>table, but works like 64 nested "if then elses". Not sure about how the branch
>>prediction works here, but i guess the squares on the "black" side are a bit
>>handicapped ;-)
>>
>>I suggest a lookup-approach (similar to your popcount), Walter Faxon's genius
>>folding trick, Matt Taylor's de Bruijn approach or several assembler routines
>>using 32-bit bsf, all already posted here in CCC x times.
>
>When I switched to bitboards from 0x88 representation I run some tests (with
>Win32 version), trying to use lookup table for LSB too. The results were (for
>set of random 64-bit numbers generated by my Randon64):
>
>1. lookup is faster than switch() for PopCount
>2. switch() is faster than lookup for LSB
>
>But I can't remember exact digits now...

That is because your test is flawed ;)

Bitboards in chess are usually sparsely populated (3-4 ones, a bunch of zeroes).
 In a random number, there is a 50% chance that the first bit is 1, a 75% chance
that the first 1-2 bits are 1, etc.  In other words, your tree was OK because
you never had to traverse it.

I liked the program though - very readable source, good clean OO design.  How
strong is it, and how many nodes/sec does it search?

anthony

P.S. This Zappa's bitscan code - stolen from Walter Faxon & Matt Taylor.  Taylor
uses a multiply while faxon uses several shifts and adds.  I think the multiply
is slower on P4 but faster on Athlon.

---------------------------------------------------------------------------
//Table reports number of low-order bit as 0, high-order as 63.
//Faxon style constants for constant folding
const zappa_u8 LSB_64_table[154] =
{23,0,0,0,31,0,0,39,19,0,17,16,18,0,47,10,20,9,8,11,
				    1,0,2,57,56,58,3,12,0,59,0,0,21,0,4,0,0,60,0,0,0,0,
				    0,13,0,0,0,0,0,0,5,0,0,61,0,0,0,0,0,0,0,0,0,0,22,0,
				    0,0,30,0,0,38,0,0,0,14,0,0,46,0,0,0,6,0,0,62,0,0,0,
				    54,0,0,0,0,0,0,0,0,0,0,29,0,0,37,0,0,0,0,0,0,45,0,
				    0,0,0,0,28,0,0,36,0,53,0,0,27,0,44,35,26,24,25,34,
				    32,33,43,40,41,52,42,15,0,50,48,49,0,51,7,0,0,63,0,
				    0,0,55};

#define LSB_64_adj  -51                         // offset to table base
#define LSB_64_magic  ((zappa_u32)0x01C5FC81)       // magic constant

//Taylor style constants for constant folding (of course, leading zero count and
least significant bit are the same)
const zappa_u8 LZC_64_table[64] = { 0, 31, 4, 33, 60, 15, 12, 34, 61, 25, 51,
10, 56, 20, 22, 35,
				    62, 30, 3, 54, 52, 24, 42, 19, 57, 29, 2, 44, 47, 28, 1, 36,
				    63, 32, 59, 5, 6, 50, 55, 7, 16, 53, 13, 41, 8, 43, 46, 17,
				    26, 58, 49, 14, 11, 40, 9, 45, 21, 48, 39, 23, 18, 38, 37, 27};

#define LZC_64_magic ((zappa_u32)2015959759)

//#define LSB_TAYLOR_STYLE
#define LSB_FAXON_STYLE

//Return the location of the first one in a 1-bit.  Uses
//De Bruijn polynomials for constant folding + hash function + table lookup.
//Options are the Matt Taylor multiply fold or the Walter Faxon xor/shift/add
fold.
int lead_one(zappa_u64 bb)
{
  zappa_u64 t64;
  zappa_u32 t32;

  t64 = bb - 1;
  bb &= t64;
  t64 ^= bb;
  t32 = (zappa_u32)t64 ^ (zappa_u32)(t64 >> 32);

#ifdef LSB_TAYLOR_STYLE
	t32 *= LZC_64_magic;
	return LZC_64_table[(t32 >> 26) & 63];
#endif

#ifdef LSB_FAXON_STYLE
  t32 ^= LSB_64_magic;
  t32 += t32 >> 16;
  t32 -= t32 >> 8;
  return (int)LSB_64_table [LSB_64_adj + (zappa_u8)t32];
#endif
}



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.