Author: Walter Faxon
Date: 22:42:31 12/22/02
Go up one level in this thread
On December 22, 2002 at 20:10:45, Matt Taylor wrote:
>snip<
>I've added several tests using your 32-bit routine. It actually works a little
>better, I think. (It also helps that the table is 20 bytes smaller -- but that's
>mostly irrelevant.) I also added the "naive" method which looks roughly like
>this:
>
>int shift_scan(bitboard bb)
>{
> index = 0;
> while(bb != 0 && (bb & 1) == 0)
> {
> index++;
> bb >>= 1;
> }
>
> return index;
>}
>
>One other interesting possibility opens up due to using a 32-bit scan routine
>like that: parallelism! The MMX routine performs poorly when applied to the
>64-bit algorithm because it is difficult to fold at the end. I'm working on a
>32-bit version. Initially it was -almost- the fastest bitscan yet, but then I
>realized that I had a bug. I can do the bitscans in parallel, but I can't
>actually clear both bits -- only the one I return. The current code ends up
>throwing a bit away, cutting the time in half. After I fix the bug, we'll see
>how it changes things.
>
>You know, it would also be interesting also to have a routine return two bits
>and do some unrolling. It would make MMX invaluable as you could theoretically
>grab 8 bits at a time.
>
>-Matt
Hey, Matt.
Well, as long as you're adding other test code, below is one more 32-bit
routine, adapted from a post by Gerd last June (in fact, made super-pedantic).
No table required.
-- Walter
Code follows:
// ---------------------------------------------------------------------------
// LSB_32x() -- find, remove, report least-significant bit of 32.
// Argument must be non-null. Method: convert conditional tests to 0/1.
// Based on 64-bit code posted to CCC by Gerd Isenberg, June 18, 2002.
// Written by Walter Faxon, December 2002. No copyright. No warranty.
//
inline // may differ by compiler
int
LSB_32x( u32* value_p )
{
u32 copy, bit, index;
copy = *value_p; // get 32-bit value
bit = -copy; // get LSB as single bit..
//
// NOTE: at this point, normal assembly code would leave the
// zero flag set when the input (*value_p == 0). If a "break" or
// "goto" on that flag is possible, one could for example, loop
// without any leading test for (*value_p != 0).
//
bit &= copy; // ..done
copy ^= bit; // remove from value
*value_p = copy; // and store back
// convert conditional tests to 0/1, building binary number
index = ((bit & 0xFFFF0000) != 0);
index <<= 1;
index |= ((bit & 0xFF00FF00) != 0);
index <<= 1;
index |= ((bit & 0xF0F0F0F0) != 0);
index <<= 1;
index |= ((bit & 0xCCCCCCCC) != 0);
index <<= 1;
index |= ((bit & 0xAAAAAAAA) != 0);
return index;
}
//eof
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.