Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Bit Scan Timings

Author: Walter Faxon

Date: 02:16:18 12/21/02

Go up one level in this thread


On December 21, 2002 at 02:40:42, Matt Taylor wrote:

>OK, after a bit of work cleaning up an undergrad project (the buggy source
>highlighter that I used to convert the code -> html), I've got the code up.
>
>Timings:
>http://65.186.75.165/~mtaylor/bitscan_athlon.html
>
>Code:
>http://65.186.75.165/~mtaylor/ipcbench.c.html
>http://65.186.75.165/~mtaylor/bitscan.c.html
>
>I'll run results for an original K6 tomorrow just as a matter of curiousity. I
>don't have access to -ANY- Intel machines right now, so I can't get Intel data.
>If anyone wants to help out, I can toss the binary up on FTP. Offhand I don't
>think it will run terribly well on a P4 and possibly not very well on previous
>Intel processors.
>
>-Matt


Matt -- very interesting!

One possible improvement that doesn't translate to pure C, is the note in the
following:

// ---------------------------------------------------------------------------
// LSB_64_r1() -- find, remove, report least-significant bit of 64.
// Argument 'bb' must be non-null.  Method:  fold then table lookup.
// Written by Walter Faxon, June 2002.  No copyright.  No warranty.
//
inline                  // inline declaration may differ by compiler
u8 LSB_64_r1( u64* bb )
    {
    u64 t64;
    u32 t32;
    u8 t8;
    t64 = *bb - 1;
    //
    // NOTE:  at this point, normal assembly code would leave the
    // carry flag set when the input (bb == 0).  If a "break" or
    // "goto" on that flag is possible, one could for example, loop
    // without any leading test for (bb != 0).
    //
    *bb &= t64;         // comment here earlier was artifact
    t64 ^= *bb;
    t32 = (u32)t64 ^ (u32)(t64 >> 32);
    t32 ^= LSB_64_magic;
    t32 += t32 >> 16;
    t8 = (u8)t32 - (u8)(t32 >> 8);      // regs addressable by byte
    return LSB_64_table [LSB_64_adj + t8];
    }

Of course, it might also help in some other code.

Question:  would using a pure-C routine to remove/report a bit in a 32-bit word
just be better?  It would require an additional loop, so at least one additional
variable, so I thought it best to skip it.  But you wouldn't have to update both
halves each run through.  That'd save some time and registers, at a premium
here.  I can post the 32-bit code and table if you like.

Or maybe we're just flogging this one little function to death... :)

-- Walter



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.