Author: Gerd Isenberg
Date: 07:10:44 02/03/04
On February 03, 2004 at 06:34:30, Omid David Tabibi wrote:
>I tested the following two codes:
>
>////////////////////////////////////////////////////
>
>unsigned char lsbit[256];
>
>void init_bitscan() {
>
>    int i, j;
>
>    for (i = 0; i < 256; i++) {
>        for (j = 0; j < 8; j++) {
>            if (i & (1 << j)) {
>                lsbit[i] = j;
>                break;
>            }
>        }
>    }
>}
>
>__forceinline int findFirstBitTrue(UINT32 data) {
>
>    int result = 0;
>    if (!(data & 0xffff)) {
>        data >>= 16;
>        result += 16;
>    }
>    if (!(data & 0xff)) {
>        data >>= 8;
>        result += 8;
>    }
>    return result + lsbit[data & 0xff];
>}
>
>////////////////////////////////////////////////////
>
>__forceinline int findFirstBitTrue(UINT32 data) {
>    __asm bsf eax, dword ptr[data]
>};
>
>////////////////////////////////////////////////////
>
>I ran the benchmark on Falcon using these two implementations. The C version
>slowed down the engine by a little over 2%. Not as bad as I expected...
Omid, what did you expect?
For example, how much does your engine slow down if you double the code size
and runtime of the bsf routine?
__forceinline int findFirstBitTrue(UINT32 data) {
    __asm bsf eax, dword ptr[data]
    __asm bsf eax, dword ptr[data]
}
So far I have found no documentation on an MSC bsf intrinsic for AMD64, or on
whether there is a 32-bit version. Moving a 32-bit value into eax zero-extends
it to 64 bits in rax, so you may even be able to use a 64-bit bsf for your purpose.
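A minimal sketch of that last idea, not a tested engine routine. It assumes a compiler that offers a 64-bit scan: `_BitScanForward64` from `<intrin.h>` in MSVC x64 builds, or `__builtin_ctzll` in GCC/Clang. The helper names `bitscan_forward64` and `find_first_bit_true32` are made up for illustration. Because the 32-bit input is zero-extended, the upper 32 bits are all zero, so for any nonzero input the 64-bit scan returns the same index a 32-bit bsf would. As with bsf itself, the result for data == 0 is not meaningful.

```c
#include <stdint.h>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#endif

/* Hypothetical helper: index of the lowest set bit of a NONZERO 64-bit value. */
static int bitscan_forward64(uint64_t x) {
#if defined(_MSC_VER) && defined(_M_X64)
    unsigned long idx;
    _BitScanForward64(&idx, x);      /* MSVC x64 intrinsic */
    return (int)idx;
#elif defined(__GNUC__)
    return __builtin_ctzll(x);       /* GCC/Clang builtin */
#else
    /* Plain C fallback (assumption: speed does not matter here). */
    int n = 0;
    while (!(x & 1)) {
        x >>= 1;
        ++n;
    }
    return n;
#endif
}

/* 32-bit wrapper: the cast zero-extends, so the upper half cannot
   affect the index of the lowest set bit. Caller must pass data != 0. */
static int find_first_bit_true32(uint32_t data) {
    return bitscan_forward64((uint64_t)data);
}
```

Whether the 64-bit form is actually as fast as a 32-bit one on a given processor would still have to be measured.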
Gerd
Last modified: Thu, 15 Apr 21 08:11:13 -0700