Computer Chess Club Archives


Subject: Re: Puzzling Assembler Question

Author: Russell Reagan

Date: 16:24:10 02/19/04


On February 19, 2004 at 18:16:17, Dann Corbit wrote:

>You might want to examine here and below for an alternative pure C model that
>seems to be just about the same speed:
>http://www.talkchess.com/forums/1/message.html?349781

I agree. The performance of the bsf/bsr instructions seems to depend greatly on
which CPU you are using. For instance, on a PIII they are about ten times
faster than on an Athlon. On the Athlon, the C versions are usually
faster. I don't know about the P4.

This is the one I usually use. Eugene Nalimov wrote it for the Itanium, but on
32-bit hardware it runs as fast as or faster than most of the other routines
I've tried. It requires one table lookup, but the table has only 256
entries, so it is cache friendly. Each table entry is just the index of the
highest set bit of that byte value.

int EugeneBitscan (Bitboard arg) {
    int result = 0;

    if (arg > 0xFFFFFFFF) {
        arg >>= 32;
        result = 32;
    }

    if (arg > 0xFFFF) {
        arg >>= 16;
        result += 16;
    }

    if (arg > 0xFF) {
        arg >>= 8;
        result += 8;
    }

    return result + table8[arg];
}
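
The routine above depends on a Bitboard type and a table8 array that are not
shown in this post. Here is a minimal sketch of how they might be set up,
assuming Bitboard is a 64-bit unsigned integer and table8[b] holds the index
(0 to 7) of the highest set bit of byte b. The init_table8 helper and the
assert on a zero argument are my own additions for illustration, not part of
Eugene's original.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a bitboard is a 64-bit unsigned integer. */
typedef uint64_t Bitboard;

/* table8[b] = index (0..7) of the highest set bit of byte b.
   table8[0] is left at 0; a zero argument has no set bit to report. */
static unsigned char table8[256];

/* Hypothetical helper: fill table8 once at program startup. */
static void init_table8(void) {
    int b;
    table8[0] = 0;
    for (b = 1; b < 256; b++) {
        int bit = 7;
        while (((b >> bit) & 1) == 0)
            bit--;                      /* walk down to the top set bit */
        table8[b] = (unsigned char)bit;
    }
}

/* Same logic as the routine above: narrow the argument to its highest
   nonzero byte, then finish with a single table lookup. */
static int EugeneBitscan(Bitboard arg) {
    int result = 0;

    assert(arg != 0);                   /* added precondition, not in the original */

    if (arg > 0xFFFFFFFF) {
        arg >>= 32;
        result = 32;
    }

    if (arg > 0xFFFF) {
        arg >>= 16;
        result += 16;
    }

    if (arg > 0xFF) {
        arg >>= 8;
        result += 8;
    }

    return result + table8[arg];
}
```

Call init_table8() once before the first scan. With the table filled this way,
EugeneBitscan behaves like bsr: it returns the index of the most significant
set bit, counting from 0 at the low end.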

Have a look here for some others that I compared. There are comparisons for both
32-bit and 64-bit hardware.

http://chessprogramming.org/cccsearch/ccc.php?art_id=333679




Last modified: Thu, 15 Apr 21 08:11:13 -0700
