Author: Gerd Isenberg
Date: 07:19:23 05/09/05
Go up one level in this thread
On May 09, 2005 at 07:48:51, Steven Edwards wrote: >The PowerPC architecture has an instruction for counting the leading zero bits >in a word. This is either a 32 bit or a 64 bit operation depening on the CPU >mdel and mode in use. Naturally, this instruction can be helpful for optimizing >the FirstSq and NextSq functions. Unfortunately, there is no PowerPC >instruction for bit counting. > >Here's the sample wrapper for the leading zero bit counter from a GCC header >file: > >static inline int __cntlzw (int value) __attribute__((always_inline)); >static inline int >__cntlzw (int value) >{ > long result; > __asm__ ("cntlzw %0, %1" > /* outputs: */ : "=r" (result) > /* inputs: */ : "r" (value)); > return result; >} > >The actual code for FirstSq/NextSq will depend on the specific bit/square >correspondance. For Symbolic's toolkit, the CTBB::NextSq() member function is: > >#if (CTHostMac && CTArchBits32 && CTAllowAssembly) > CTSq NextSq(void) > { > int theZC = __cntlzw(myDwrdVec[0]); > > if (theZC != 32) > { > myDwrdVec[0] ^= 1 << (31 - theZC); > return (CTSq) (theZC ^ 0x07); > } > else > { > theZC = __cntlzw(myDwrdVec[1]); > if (theZC != 32) > { > myDwrdVec[1] ^= 1 << (31 - theZC); > return (CTSq) ((theZC ^ 0x07) + 32); > } > else > return CTSqNil; > }; > } >#endif > >Using the above code vs. the table lookup model results in a speed increase of >about 16% when running movepath enumeration (i.e., perft). Overall speedup is >somewhat less although still significant. > >The above "(theZC ^ 0x07)" operations could be removed if the bit/square >correspondance was altered (reversed) to have a-file squares appear in the MSbit >instead of the LSBit. I believe this is how Crafty arranges the bits, but I >haven't tested it. Currently, Crafty does not have assemply language level >optimizations for PowerPC. Possibly some ambitious author could contribute in >that area. Hi Steven, confusion about LSBit and MSBit ;-) Lets take the first rank with eight binary digits for each file. Bit 0, the most right is the least significant bit LSBit. I use the "mirrored" mapping LSB = a1 and MSB = h1, while you seem to map in the "natural" way MSB = a1 and LSB h1? MSBit LSBit 2**0 | | 00000001B 0x01 ; a1 for me - h1 for you 10000000B 0x80 ; h1 for me - a1 for you If you traverse a (-1) bitboard (all bits set) with that routine you receive a kind of "sawtooth": 56,57,..,63, 48,47,..,55,.,..,..,0,1,..,7. Did i get that correct? May be a bit pedantic to save a subtract and to shift INT_MIN logical right ;-) myDwrdVec[0] ^= (unsigned)(1<<31) >> theZC; // >>> instead of myDwrdVec[0] ^= 1 << (31 - theZC); Thanks, Gerd
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.