Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: BitScan with reset - not so impressive with 3DNow!

Author: Russell Reagan

Date: 12:50:15 12/03/02

Go up one level in this thread


On December 02, 2002 at 18:49:10, Walter Faxon wrote:

>// ---------------------------------------------------------------------------
>
>typedef unsigned long long  u64;    // nonstandard
>typedef unsigned long       u32;
>typedef unsigned char       u8;
>
>extern const u8 LSB_64_table[154];              // bit number table
>#define LSB_64_adj  -51                         // offset to table base
>#define LSB_64_magic  ( (u32)0x01C5FC81 )       // magic constant
>
>// ---------------------------------------------------------------------------
>// LSB_64() -- find, remove, report least-significant bit of 64.
>// Argument 'bb' must be non-null.  Method:  fold then table lookup.
>// Written by Walter Faxon, June 2002.  No copyright.  No warranty.
>//
>inline                  // inline declaration may differ by compiler
>u8 LSB_64( u64* bb )
>    {
>    u64 t64;
>    u32 t32;
>    t64 = *bb - 1;
>    *bb &= t64;         // omit this line to retain current LSB
>    t64 ^= *bb;
>    t32 = (u32)t64 ^ (u32)(t64 >> 32);
>    t32 ^= LSB_64_magic;
>    t32 += t32 >> 16;
>    t32 -= t32 >> 8;
>    return LSB_64_table [LSB_64_adj + (u8)t32];
>    }
>
>// ---------------------------------------------------------------------------
>// Table reports number of low-order bit as 0, high-order as 63.
>// (Numbering can be reversed by changing this table.)
>// Important:  arrange storage so that this table is kept in the cache.
>const u8 LSB_64_table[154] =
>    {
>#define __  0
>    23,__,__,__,31,__,__,39,19,__, 17,16,18,__,47,10,20, 9, 8,11,
>     1, 0, 2,57,56,58, 3,12,__,59, __,__,21,__, 4,__,__,60,__,__,
>    __,__,__,13,__,__,__,__,__,__,  5,__,__,61,__,__,__,__,__,__,
>    __,__,__,__,22,__,__,__,30,__, __,38,__,__,__,14,__,__,46,__,
>    __,__, 6,__,__,62,__,__,__,54, __,__,__,__,__,__,__,__,__,__,
>    29,__,__,37,__,__,__,__,__,__, 45,__,__,__,__,__,28,__,__,36,
>    __,53,__,__,27,__,44,35,26,24, 25,34,32,33,43,40,41,52,42,15,
>    __,50,48,49,__,51, 7,__,__,63, __,__,__,55
>#undef __
>    };
>
>//eof

Is Athlon's bsf instruction significantly slower than that on a P3? I tried this
code against a bsf version on my P3 733 and bsf kills it, big time. I tried all
of these without the feature of resetting the bit. I'm still amazed by the bsf
time though.

bsf:        4.265 seconds
LSB_64:     42.968 seconds
LastOne():  39.906 seconds
FirstOne(): 55.125 seconds

These were posted by Eugene Nalimov. I found them in the archives.

int LastOne(BITBOARD arg1) {
unsigned bias;

bias = 0;
if ((unsigned) arg1) {
arg1 = (unsigned) arg1;
bias = 32;
}
else
arg1 >>= 32;
if (arg1&65535) {
arg1 &= 65535;
bias += 16;
}
else
arg1 >>= 16;
if (arg1&255) {
arg1 &= 255;
bias += 8;
}
else
arg1 >>= 8;
return (last_ones_8bit[arg1]+bias);
}

int FirstOne(BITBOARD arg1) {
unsigned bias;

bias = 0;
if (arg1>>32)
arg1 >>= 32;
else
bias += 32;
if (arg1>>16)
arg1 >>= 16;
else
bias += 16;
if (arg1 >> 8)
arg1 >>= 8;
else
bias += 8;
return (first_ones_8bit[arg1]+bias);
}

Russell



This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.