Computer Chess Club Archives

Subject: Re: Assembly Programmers Challenge! (repost and clarification)

Author: Gerd Isenberg

Date: 02:32:43 01/22/03


On January 22, 2003 at 04:20:04, Matt Taylor wrote:

>On January 22, 2003 at 03:00:22, Gerd Isenberg wrote:
>
>>On January 22, 2003 at 02:54:44, Matt Taylor wrote:
>>
>>><snip>
>>>>__forceinline
>>>>UINT BitSearchAndReset(BitBoard &bb)
>>>>{
>>>>#ifdef	_M_IX86
>>>>#ifdef USE_SAVE_BSF
>>>>	__asm
>>>>	{
>>>>		xor		edx, edx
>>>>		mov		ebx, [bb]
>>>>		xor		eax, eax
>>>>		inc		edx
>>>>		bsf		ecx, [ebx]
>>>>		jnz		found
>>>>		bsf		ecx, [ebx + 4]
>>>>		lea		ebx, [ebx + 4]
>>>>		xor		eax, 32
>>>>	found:
>>>>		shl		edx, cl
>>>>		xor		eax, ecx
>>>>		xor		[ebx], edx
>>>>	}
>>>>#else
>>>>	__asm
>>>>	{
>>>>		mov		edx, [bb]
>>>>		bsf		eax, [edx+4]
>>>>		xor		eax, 32
>>>>		bsf		eax, [edx]
>>>>		btr		[edx],eax
>>>>	}
>>>>#endif
>>>>...
>>>>}
>>>>
>>>>Regards,
>>>>Gerd
>>>
>>>Remember, bsr/bsf/btr are -slow- on Athlon. That bit search & reset is at least
>>>16 clocks (bsf twice, likely ~5-6 clocks additional overhead at a glance). I
>>>guess I should give up on the last bitscan and post my results. Walter's routine
>>>optimizes to 11 clocks + 134 byte table.
>>>
>>>-Matt
>>
>>Hi Matt,
>>
>>Yes, I know. I tried a lot, but this inlined conditional version is the fastest
>>so far in IsiChess, though not in a dumb loop test. The remaining code size
>>matters a lot, and it is used so often in my program...
>>
>>Cheers,
>>Gerd
>
>Why would code size matter?
>
>-Matt

If I use an inlined function, let's say 256 times, it matters whether the
function is 32 bytes or 128 bytes long, even if the 128-byte function is faster
once it is already in cache.
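
To put rough numbers on that, here is a minimal sketch of the arithmetic. The helper names (inlinedFootprint, extraFootprint) are made up for illustration, and the 256 / 32 / 128 figures are just the round numbers from above, not measurements from IsiChess.

```cpp
#include <cstddef>

// Hypothetical helper: total code bytes when one function body is
// duplicated (inlined) at every call site.
constexpr std::size_t inlinedFootprint(std::size_t callSites,
                                       std::size_t bytesPerCopy)
{
    return callSites * bytesPerCopy;
}

// Hypothetical helper: extra code bytes paid for choosing the larger body.
constexpr std::size_t extraFootprint(std::size_t callSites,
                                     std::size_t smallBytes,
                                     std::size_t largeBytes)
{
    return inlinedFootprint(callSites, largeBytes)
         - inlinedFootprint(callSites, smallBytes);
}

// Round numbers from the post: 256 call sites, 32 vs. 128 bytes per copy.
static_assert(inlinedFootprint(256, 32) == 8192, "8 KB for the short version");
static_assert(inlinedFootprint(256, 128) == 32768, "32 KB for the long version");
static_assert(extraFootprint(256, 32, 128) == 24576, "24 KB of additional code");
```

So with these numbers the longer routine costs 24 KB more code spread over the whole program. That extra code competes with everything else for instruction cache, and a loop test that keeps a single copy hot does not show it.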

Gerd


