Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: A data point for PowerPC bitboard program authors

Author: Tord Romstad

Date: 02:41:03 05/10/05

Go up one level in this thread


On May 09, 2005 at 07:48:51, Steven Edwards wrote:

>Using the above code vs. the table lookup model results in a speed increase of
>about 16% when running movepath enumeration (i.e., perft).  Overall speedup is
>somewhat less although still significant.

The cntlzw and cntlzdw instructions are certainly worth a look for
bitboarders who run their engines on PowerPC CPUs, but comparing the speed to
the table lookup method is not very interesting.  I've found that table lookup
is the slowest of all the common bit scanning techniques, on the G4 as well as
the G5.

I use the deBruijn multiplication trick:

const uint32 BitTable[64] = {
  0,1,2,7,3,21,16,35,4,49,22,52,17,66,36,80,5,33,50,70,23,86,53,96,18,55,67,
  102,37,98,81,113,119,6,20,34,48,51,65,71,32,69,85,87,54,101,97,112,118,19,
  39,64,68,84,100,103,117,38,83,99,116,82,115,114
};

inline unsigned first_1(bitboard_t b) {
  return BitTable[((b&-b)*0x218a392cd3d5dbfULL)>>58];
}

I no longer remember exactly how big the difference in speed between this
and the cntlzdw instruction was, but I remember that it was so tiny that
there was no point in using inline assembly language.  As always, YMMV.

Tord



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.