Author: Walter Faxon
Date: 22:42:31 12/22/02
Go up one level in this thread
On December 22, 2002 at 20:10:45, Matt Taylor wrote:
>snip<
>I've added several tests using your 32-bit routine. It actually works a little
>better, I think. (It also helps that the table is 20 bytes smaller -- but that's
>mostly irrelevant.) I also added the "naive" method which looks roughly like
>this:
>
>int shift_scan(bitboard bb)
>{
>    index = 0;
>    while(bb != 0 && (bb & 1) == 0)
>    {
>        index++;
>        bb >>= 1;
>    }
>
>    return index;
>}
>
>One other interesting possibility opens up due to using a 32-bit scan routine
>like that: parallelism! The MMX routine performs poorly when applied to the
>64-bit algorithm because it is difficult to fold at the end. I'm working on a
>32-bit version. Initially it was -almost- the fastest bitscan yet, but then I
>realized that I had a bug. I can do the bitscans in parallel, but I can't
>actually clear both bits -- only the one I return. The current code ends up
>throwing a bit away, cutting the time in half. After I fix the bug, we'll see
>how it changes things.
>
>You know, it would also be interesting also to have a routine return two bits
>and do some unrolling. It would make MMX invaluable as you could theoretically
>grab 8 bits at a time.
>
>-Matt
Hey, Matt.
Well, as long as you're adding other test code, below is one more 32-bit
routine, adapted from a post by Gerd last June (in fact, made super-pedantic).
No table required.
-- Walter
Code follows:
// ---------------------------------------------------------------------------
// LSB_32x() -- find, remove, report least-significant bit of 32.
// Argument must be non-null.  Method:  convert conditional tests to 0/1.
// Based on 64-bit code posted to CCC by Gerd Isenberg, June 18, 2002.
// Written by Walter Faxon, December 2002.  No copyright.  No warranty.
//
inline                      // may differ by compiler
int
LSB_32x( u32* value_p )
    {
    u32 copy, bit, index;
    copy = *value_p;        // get 32-bit value
    bit = -copy;            // get LSB as single bit..
    //
    // NOTE:  at this point, normal assembly code would leave the
    // zero flag set when the input (*value_p == 0).  If a "break" or
    // "goto" on that flag is possible, one could for example, loop
    // without any leading test for (*value_p != 0).
    //
    bit &= copy;            // ..done
    copy ^= bit;            // remove from value
    *value_p = copy;        // and store back
    // convert conditional tests to 0/1, building binary number
    index  = ((bit & 0xFFFF0000) != 0);
    index <<= 1;
    index |= ((bit & 0xFF00FF00) != 0);
    index <<= 1;
    index |= ((bit & 0xF0F0F0F0) != 0);
    index <<= 1;
    index |= ((bit & 0xCCCCCCCC) != 0);
    index <<= 1;
    index |= ((bit & 0xAAAAAAAA) != 0);
    return index;
    }
//eof
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.