Author: Tony Werten
Date: 04:30:41 05/10/05
Go up one level in this thread
On May 10, 2005 at 07:23:19, Steven Edwards wrote: >On May 10, 2005 at 05:41:03, Tord Romstad wrote: > >>The cntlzw and cntlzdw instructions are certainly worth a look for >>bitboarders who run their engines on PowerPC CPUs, but comparing the speed to >>the table lookup method is not very interesting. I've found that table lookup >>is the slowest of all the common bit scanning techniques, on the G4 as well as >>the G5. >> >>I use the deBruijn multiplication trick: >> >>const uint32 BitTable[64] = { >> 0,1,2,7,3,21,16,35,4,49,22,52,17,66,36,80,5,33,50,70,23,86,53,96,18,55,67, >> 102,37,98,81,113,119,6,20,34,48,51,65,71,32,69,85,87,54,101,97,112,118,19, >> 39,64,68,84,100,103,117,38,83,99,116,82,115,114 >>}; >> >>inline unsigned first_1(bitboard_t b) { >> return BitTable[((b&-b)*0x218a392cd3d5dbfULL)>>58]; >>} >> >>I no longer remember exactly how big the difference in speed between this >>and the cntlzdw instruction was, but I remember that it was so tiny that >>there was no point in using inline assembly language. As always, YMMV. > >Perhaps on a G5, but for the 32 bit G4 the above four 64 operations [-, &, *, >>>] have to be split up (the multiply in particular) and with the table >reference added, it doesn't look that good. Also, the above does not map to the >-1 + 0..63 which I need. Maybe you are using a 10x12 board like I first saw in >the 1978 Sargon. No it's 0x88 (8*16) Tony
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.