Author: Gerd Isenberg
Date: 07:19:23 05/09/05
Go up one level in this thread
On May 09, 2005 at 07:48:51, Steven Edwards wrote:
>The PowerPC architecture has an instruction for counting the leading zero bits
>in a word. This is either a 32 bit or a 64 bit operation depening on the CPU
>mdel and mode in use. Naturally, this instruction can be helpful for optimizing
>the FirstSq and NextSq functions. Unfortunately, there is no PowerPC
>instruction for bit counting.
>
>Here's the sample wrapper for the leading zero bit counter from a GCC header
>file:
>
>static inline int __cntlzw (int value) __attribute__((always_inline));
>static inline int
>__cntlzw (int value)
>{
> long result;
> __asm__ ("cntlzw %0, %1"
> /* outputs: */ : "=r" (result)
> /* inputs: */ : "r" (value));
> return result;
>}
>
>The actual code for FirstSq/NextSq will depend on the specific bit/square
>correspondance. For Symbolic's toolkit, the CTBB::NextSq() member function is:
>
>#if (CTHostMac && CTArchBits32 && CTAllowAssembly)
> CTSq NextSq(void)
> {
> int theZC = __cntlzw(myDwrdVec[0]);
>
> if (theZC != 32)
> {
> myDwrdVec[0] ^= 1 << (31 - theZC);
> return (CTSq) (theZC ^ 0x07);
> }
> else
> {
> theZC = __cntlzw(myDwrdVec[1]);
> if (theZC != 32)
> {
> myDwrdVec[1] ^= 1 << (31 - theZC);
> return (CTSq) ((theZC ^ 0x07) + 32);
> }
> else
> return CTSqNil;
> };
> }
>#endif
>
>Using the above code vs. the table lookup model results in a speed increase of
>about 16% when running movepath enumeration (i.e., perft). Overall speedup is
>somewhat less although still significant.
>
>The above "(theZC ^ 0x07)" operations could be removed if the bit/square
>correspondance was altered (reversed) to have a-file squares appear in the MSbit
>instead of the LSBit. I believe this is how Crafty arranges the bits, but I
>haven't tested it. Currently, Crafty does not have assemply language level
>optimizations for PowerPC. Possibly some ambitious author could contribute in
>that area.
Hi Steven,
confusion about LSBit and MSBit ;-)
Lets take the first rank with eight binary digits for each file. Bit 0, the most
right is the least significant bit LSBit. I use the "mirrored" mapping LSB = a1
and MSB = h1, while you seem to map in the "natural" way MSB = a1 and LSB h1?
MSBit LSBit 2**0
| |
00000001B 0x01 ; a1 for me - h1 for you
10000000B 0x80 ; h1 for me - a1 for you
If you traverse a (-1) bitboard (all bits set) with that routine
you receive a kind of "sawtooth": 56,57,..,63, 48,47,..,55,.,..,..,0,1,..,7.
Did i get that correct?
May be a bit pedantic to save a subtract and to shift INT_MIN logical right ;-)
myDwrdVec[0] ^= (unsigned)(1<<31) >> theZC; // >>>
instead of
myDwrdVec[0] ^= 1 << (31 - theZC);
Thanks,
Gerd
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.