Computer Chess Club Archives


Subject: Re: A data point for PowerPC bitboard program authors

Author: Steven Edwards

Date: 04:23:19 05/10/05


On May 10, 2005 at 05:41:03, Tord Romstad wrote:

>The cntlzw and cntlzdw instructions are certainly worth a look for
>bitboarders who run their engines on PowerPC CPUs, but comparing the speed to
>the table lookup method is not very interesting.  I've found that table lookup
>is the slowest of all the common bit scanning techniques, on the G4 as well as
>the G5.
>
>I use the deBruijn multiplication trick:
>
>const uint32 BitTable[64] = {
>  0,1,2,7,3,21,16,35,4,49,22,52,17,66,36,80,5,33,50,70,23,86,53,96,18,55,67,
>  102,37,98,81,113,119,6,20,34,48,51,65,71,32,69,85,87,54,101,97,112,118,19,
>  39,64,68,84,100,103,117,38,83,99,116,82,115,114
>};
>
>inline unsigned first_1(bitboard_t b) {
>  return BitTable[((b&-b)*0x218a392cd3d5dbfULL)>>58];
>}
>
>I no longer remember exactly how big the difference in speed between this
>and the cntlzdw instruction was, but I remember that it was so tiny that
>there was no point in using inline assembly language.  As always, YMMV.

Perhaps on a G5, but on the 32-bit G4 each of the four 64-bit operations above
[-, &, *, >>] has to be split up (the multiply in particular), and with the
table reference added, it doesn't look that good.  Also, the above does not map
to the -1 + 0..63 range which I need.  Maybe you are using a 10x12 board like I
first saw in the 1978 Sargon.
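
For illustration only, here is a minimal sketch of one way to stay in 32-bit
arithmetic and still get the -1 + 0..63 result.  It is not code from either
poster.  It assumes that -1 is the value wanted for an empty bitboard, it
splits the bitboard into two 32-bit halves, and it swaps in the widely
published 32-bit de Bruijn constant 0x077CB531 in place of the 64-bit multiply
quoted above.  The names debruijn32_table, init_table, first_1_32 and
first_1_or_minus1 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper names; none of these appear in the original posts. */
static int8_t debruijn32_table[32];
static int table_ready = 0;

/* Build the lookup table from the de Bruijn constant itself, so the table
   cannot disagree with the multiplier. */
static void init_table(void) {
    for (int i = 0; i < 32; i++) {
        uint32_t idx = ((UINT32_C(1) << i) * UINT32_C(0x077CB531)) >> 27;
        debruijn32_table[idx] = (int8_t)i;
    }
    table_ready = 1;
}

/* Index (0..31) of the lowest set bit of a nonzero 32-bit word. */
static int first_1_32(uint32_t w) {
    uint32_t lsb = w & (0u - w);              /* isolate the lowest set bit */
    return debruijn32_table[(lsb * UINT32_C(0x077CB531)) >> 27];
}

/* Returns -1 for an empty bitboard (an assumption about the desired
   mapping), otherwise the index 0..63 of the lowest set bit. */
int first_1_or_minus1(uint64_t b) {
    uint32_t lo = (uint32_t)b;
    uint32_t hi = (uint32_t)(b >> 32);
    if (!table_ready) init_table();
    if (lo) return first_1_32(lo);
    if (hi) return 32 + first_1_32(hi);
    return -1;
}
```

The only 64-bit operation left is the shift that extracts the upper half; the
negate, mask and multiply all work on 32-bit words.  Whether this actually
beats cntlzw on a G4 would have to be measured.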


