Author: Tony Werten
Date: 04:30:41 05/10/05
On May 10, 2005 at 07:23:19, Steven Edwards wrote:
>On May 10, 2005 at 05:41:03, Tord Romstad wrote:
>
>>The cntlzw and cntlzd instructions are certainly worth a look for
>>bitboarders who run their engines on PowerPC CPUs, but comparing the speed to
>>the table lookup method is not very interesting. I've found that table lookup
>>is the slowest of all the common bit scanning techniques, on the G4 as well as
>>the G5.
>>
>>I use the deBruijn multiplication trick:
>>
>>const uint32 BitTable[64] = {
>> 0,1,2,7,3,21,16,35,4,49,22,52,17,66,36,80,5,33,50,70,23,86,53,96,18,55,67,
>> 102,37,98,81,113,119,6,20,34,48,51,65,71,32,69,85,87,54,101,97,112,118,19,
>> 39,64,68,84,100,103,117,38,83,99,116,82,115,114
>>};
>>
>>inline unsigned first_1(bitboard_t b) {
>> return BitTable[((b&-b)*0x218a392cd3d5dbfULL)>>58];
>>}
>>
>>I no longer remember exactly how big the difference in speed between this
>>and the cntlzd instruction was, but I remember that it was so tiny that
>>there was no point in using inline assembly language. As always, YMMV.
>
>Perhaps on a G5, but on the 32-bit G4 the four 64-bit operations above
>[-, &, *, >>] have to be split up (the multiply in particular), and with the
>table reference added, it doesn't look that good. Also, the above does not map
>to the -1 + 0..63 which I need. Maybe you are using a 10x12 board like I first
>saw in the 1978 Sargon.
No, it's 0x88 (8*16).
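To make the 0x88 reading concrete, here is a minimal sketch in C. It assumes bit 0 of the bitboard is the first square and that each table entry is rank*16 + file. The typedef and the helpers to_0x88, from_0x88 and check_mapping are hypothetical names added for illustration; they are not part of the quoted code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t bitboard_t;  /* assumption: one bit per square, bit 0 = first square */

/* Table from the quoted post; the entries look like 0x88 squares (rank*16 + file). */
static const uint32_t BitTable[64] = {
    0,1,2,7,3,21,16,35,4,49,22,52,17,66,36,80,5,33,50,70,23,86,53,96,18,55,67,
    102,37,98,81,113,119,6,20,34,48,51,65,71,32,69,85,87,54,101,97,112,118,19,
    39,64,68,84,100,103,117,38,83,99,116,82,115,114
};

/* Lowest set bit of b, returned as a 0x88 square. b must be nonzero. */
static unsigned first_1(bitboard_t b) {
    assert(b != 0);
    /* b & -b isolates the lowest set bit; the multiply and shift hash it to 0..63. */
    return BitTable[((b & -b) * 0x218a392cd3d5dbfULL) >> 58];
}

/* Hypothetical helpers: convert between 0..63 and 0x88 numbering. */
static unsigned to_0x88(unsigned sq64)   { return sq64 + (sq64 & ~7u); }
static unsigned from_0x88(unsigned sq88) { return (sq88 + (sq88 & 7u)) >> 1; }

/* Returns 1 if every single-bit board maps to rank*16 + file, else 0. */
static int check_mapping(void) {
    for (unsigned sq = 0; sq < 64; sq++) {
        unsigned sq88 = first_1((bitboard_t)1 << sq);
        if (sq88 != to_0x88(sq) || from_0x88(sq88) != sq)
            return 0;
    }
    return 1;
}
```

With these assumptions, first_1(1ULL << 12) gives 20 (0x14), and passing the result through from_0x88 recovers the plain 0..63 index. An empty board would still need its own test to produce the -1 case.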
Tony
Last modified: Thu, 15 Apr 21 08:11:13 -0700