Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: A data point for PowerPC bitboard program authors

Author: Gerd Isenberg

Date: 07:19:23 05/09/05

Go up one level in this thread


On May 09, 2005 at 07:48:51, Steven Edwards wrote:

>The PowerPC architecture has an instruction for counting the leading zero bits
>in a word.  This is either a 32 bit or a 64 bit operation depening on the CPU
>mdel and mode in use.  Naturally, this instruction can be helpful for optimizing
>the FirstSq and NextSq functions.  Unfortunately, there is no PowerPC
>instruction for bit counting.
>
>Here's the sample wrapper for the leading zero bit counter from a GCC header
>file:
>
>static inline int __cntlzw (int value) __attribute__((always_inline));
>static inline int
>__cntlzw (int value)
>{
>  long result;
>  __asm__ ("cntlzw %0, %1"
>           /* outputs:  */ : "=r" (result)
>           /* inputs:   */ : "r" (value));
>  return result;
>}
>
>The actual code for FirstSq/NextSq will depend on the specific bit/square
>correspondance.  For Symbolic's toolkit, the CTBB::NextSq() member function is:
>
>#if (CTHostMac && CTArchBits32 && CTAllowAssembly)
>  CTSq NextSq(void)
>  {
>    int theZC = __cntlzw(myDwrdVec[0]);
>
>    if (theZC != 32)
>    {
>      myDwrdVec[0] ^= 1 << (31 - theZC);
>      return (CTSq) (theZC ^ 0x07);
>    }
>    else
>    {
>      theZC = __cntlzw(myDwrdVec[1]);
>      if (theZC != 32)
>      {
>        myDwrdVec[1] ^= 1 << (31 - theZC);
>        return (CTSq) ((theZC ^ 0x07) + 32);
>      }
>      else
>        return CTSqNil;
>    };
>  }
>#endif
>
>Using the above code vs. the table lookup model results in a speed increase of
>about 16% when running movepath enumeration (i.e., perft).  Overall speedup is
>somewhat less although still significant.
>
>The above "(theZC ^ 0x07)" operations could be removed if the bit/square
>correspondance was altered (reversed) to have a-file squares appear in the MSbit
>instead of the LSBit.  I believe this is how Crafty arranges the bits, but I
>haven't tested it.  Currently, Crafty does not have assemply language level
>optimizations for PowerPC.  Possibly some ambitious author could contribute in
>that area.


Hi Steven,

confusion about LSBit and MSBit ;-)

Lets take the first rank with eight binary digits for each file. Bit 0, the most
right is the least significant bit LSBit. I use the "mirrored" mapping LSB = a1
and MSB = h1, while you seem to map in the "natural" way MSB = a1 and LSB h1?

MSBit  LSBit 2**0
|      |
00000001B  0x01 ; a1 for me - h1 for you
10000000B  0x80 ; h1 for me - a1 for you

If you traverse a (-1) bitboard (all bits set) with that routine
you receive a kind of "sawtooth": 56,57,..,63, 48,47,..,55,.,..,..,0,1,..,7.
Did i get that correct?

May be a bit pedantic to save a subtract and to shift INT_MIN logical right ;-)

  myDwrdVec[0] ^= (unsigned)(1<<31) >> theZC; // >>>
instead of
  myDwrdVec[0] ^= 1 << (31 - theZC);

Thanks,
Gerd



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.