Computer Chess Club Archives



Subject: Re: Opteron Instruction Set

Author: Gerd Isenberg

Date: 07:10:44 02/03/04



On February 03, 2004 at 06:34:30, Omid David Tabibi wrote:

>I tested the following two codes:
>
>////////////////////////////////////////////////////
>
>unsigned char	lsbit[256];
>
>void init_bitscan() {
>
>	int i, j;
>
>	for (i = 0; i < 256; i++) {
>		for (j = 0; j < 8; j++) {
>			if (i & (1 << j)) {
>				lsbit[i] = j;
>				break;
>			}
>		}
>	}
>}
>
>__forceinline int findFirstBitTrue(UINT32 data) {
>
>	int result = 0;
>	if (!(data & 0xffff)) {
>		data >>= 16;
>		result += 16;
>	}
>	if (!(data & 0xff)) {
>		data >>= 8;
>		result += 8;
>	}
>	return result + lsbit[data & 0xff];
>}
>
>////////////////////////////////////////////////////
>
>__forceinline int findFirstBitTrue(UINT32 data) {
>	__asm   bsf		eax, dword ptr[data]
>};
>
>////////////////////////////////////////////////////
>
>I ran the benchmark on Falcon using these two implementations. The C version
>slowed down the engine by a little over 2%. Not as bad as I expected...


Omid, what did you expect?
For example, how much does your engine slow down if you double the code size
and runtime of the bsf routine?

__forceinline int findFirstBitTrue(UINT32 data) {
	__asm   bsf		eax, dword ptr[data]
	__asm   bsf		eax, dword ptr[data]
}

So far I have found no documentation on an MSC bsf intrinsic for AMD64, or on
whether there is a 32-bit version. Since writing a 32-bit value to eax
zero-extends it to 64 bits in rax, you could perhaps even use a 64-bit bsf for
your purpose.
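
As a rough sketch of the zero-extension point, here is how a 32-bit and a 64-bit scan could be written with compiler intrinsics instead of inline asm. The names findFirstBitTrue32 and findFirstBitTrue64 are made up for illustration. The sketch assumes either a Microsoft compiler that provides _BitScanForward and _BitScanForward64 in <intrin.h> (the 64-bit one only when targeting x64), or GCC/Clang with __builtin_ctz and __builtin_ctzll. Whether the compiler actually emits a bsf instruction for these is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Index of the least significant set bit of a 32-bit value.
   The result is undefined for 0, as with bsf, so 0 is rejected here. */
static inline int findFirstBitTrue32(uint32_t data) {
    assert(data != 0);
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanForward(&index, data);
    return (int)index;
#else
    return __builtin_ctz(data);      /* GCC/Clang builtin (assumption) */
#endif
}

/* Same scan on a 64-bit value. A nonzero 32-bit value that has been
   zero-extended to 64 bits has the same lowest set bit, so this returns
   the same index as the 32-bit scan for such inputs. */
static inline int findFirstBitTrue64(uint64_t data) {
    assert(data != 0);
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanForward64(&index, data); /* x64 targets only (assumption) */
    return (int)index;
#else
    return __builtin_ctzll(data);    /* GCC/Clang builtin (assumption) */
#endif
}
```

For any nonzero UINT32 value x, findFirstBitTrue64((uint64_t)x) should then give the same index as findFirstBitTrue32(x), so a single 64-bit routine could serve both cases.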

Gerd




Last modified: Thu, 15 Apr 21 08:11:13 -0700
