Author: Gerd Isenberg
Date: 02:32:43 01/22/03
On January 22, 2003 at 04:20:04, Matt Taylor wrote:
>On January 22, 2003 at 03:00:22, Gerd Isenberg wrote:
>
>>On January 22, 2003 at 02:54:44, Matt Taylor wrote:
>>
>>><snip>
>>>>__forceinline
>>>>UINT BitSearchAndReset(BitBoard &bb)
>>>>{
>>>>#ifdef _M_IX86
>>>>#ifdef USE_SAVE_BSF
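>>>> __asm
>>>> {
>>>> xor edx, edx        ; edx = 0
>>>> mov ebx, [bb]       ; ebx = address of the bitboard
>>>> xor eax, eax        ; eax = 0 (index offset for the low dword)
>>>> inc edx             ; edx = 1 (seed for the bit mask)
>>>> bsf ecx, [ebx]      ; scan the low dword; ZF is set if it is zero
>>>> jnz found           ; low dword has a bit set
>>>> bsf ecx, [ebx + 4]  ; otherwise scan the high dword
>>>> lea ebx, [ebx + 4]  ; point ebx at the high dword
>>>> xor eax, 32         ; index offset 32 for the high dword
>>>> found:
>>>> shl edx, cl         ; edx = 1 << bit index within the dword
>>>> xor eax, ecx        ; eax = bit index 0..63 (the result)
>>>> xor [ebx], edx      ; clear the bit that was found
>>>> }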
>>>>#else
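>>>> __asm
>>>> {
>>>> mov edx, [bb]       ; edx = address of the bitboard
>>>> bsf eax, [edx+4]    ; eax = bit index in the high dword, if it is nonzero
>>>> xor eax, 32         ; add 32 for the high dword
>>>> bsf eax, [edx]      ; low dword wins if nonzero; relies on bsf leaving eax unchanged for a zero source
>>>> btr [edx],eax       ; clear bit eax (0..63) of the 64-bit board
>>>> }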
>>>>#endif
>>>>...
>>>>}
>>>>
>>>>Regards,
>>>>Gerd
>>>
>>>Remember, bsr/bsf/btr are -slow- on Athlon. That bit search & reset is at least
>>>16 clocks (bsf twice, likely ~5-6 clocks additional overhead at a glance). I
>>>guess I should give up on the last bitscan and post my results. Walter's routine
>>>optimizes to 11 clocks + 134 byte table.
>>>
>>>-Matt
>>
>>Hi Matt,
>>
>>Yes, I know. I tried a lot, but this inlined conditional version is the fastest so
>>far in IsiChess, though not in a dumb loop test. The remaining code size matters a
>>lot, and it is used so often in my program...
>>
>>Cheers,
>>Gerd
>
>Why would code size matter?
>
>-Matt
If I use an inlined function, let's say 256 times, it matters whether the function
is 32 bytes or 128 bytes long, even if a single 128-byte function is faster once
it is already in cache.
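
To put rough numbers on that, here is a minimal sketch of the arithmetic. The helper name inlinedFootprint and the constants are only for illustration (they are not from IsiChess), and the 256 call sites are just the "let's say" figure above:

```cpp
#include <cstddef>

// Hypothetical helper: total code bytes added when one routine is
// inlined at every call site (one full copy per call site).
constexpr std::size_t inlinedFootprint(std::size_t bytesPerCopy,
                                       std::size_t callSites) {
    return bytesPerCopy * callSites;
}

// Assumed figures taken from the example in the post.
constexpr std::size_t kCallSites  = 256;
constexpr std::size_t kSmallTotal = inlinedFootprint(32, kCallSites);   // 32-byte version
constexpr std::size_t kLargeTotal = inlinedFootprint(128, kCallSites);  // 128-byte version
```

With these figures the 32-byte version costs 8192 bytes in total and the 128-byte version 32768 bytes, so the larger routine takes an extra 24 KB of instruction cache away from the rest of the program. A dumb loop test does not show this, because there only one copy is ever executed.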
Gerd
Last modified: Thu, 15 Apr 21 08:11:13 -0700