Author: Matt Taylor
Date: 23:54:44 01/21/03
<snip>
>__forceinline
>UINT BitSearchAndReset(BitBoard &bb)
>{
>#ifdef _M_IX86
>#ifdef USE_SAVE_BSF
> __asm
> {
> xor edx, edx
> mov ebx, [bb]
> xor eax, eax
> inc edx
> bsf ecx, [ebx]
> jnz found
> bsf ecx, [ebx + 4]
> lea ebx, [ebx + 4]
> xor eax, 32
> found:
> shl edx, cl
> xor eax, ecx
> xor [ebx], edx
> }
>#else
> __asm
> {
> mov edx, [bb]
> bsf eax, [edx+4]
> xor eax, 32
> bsf eax, [edx]
> btr [edx],eax
> }
>#endif
>...
>}
>
>Regards,
>Gerd
Remember, bsr/bsf/btr are -slow- on Athlon. That bit search & reset is at least
16 clocks (bsf twice, likely ~5-6 clocks of additional overhead at a glance). I
guess I should give up on the last bitscan and post my results. Walter's routine
optimizes to 11 clocks plus a 134-byte table.
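
For anyone who wants to see what a table-driven search & reset looks like
without bsf/btr, here is a minimal sketch. It is NOT Walter's routine and it has
not been timed; it is the generic De Bruijn multiply with a 64-byte table.
The constant, the BitScanTable helper and the name BitSearchAndResetTable are
assumptions for illustration, and BitBoard is assumed to be a plain 64-bit
integer. The caller must make sure bb is non-zero.

```cpp
#include <cstdint>

// Assumption: BitBoard is a plain 64-bit unsigned integer.
typedef std::uint64_t BitBoard;

// Assumed constant: a 64-bit De Bruijn sequence. Every 6-bit window of it
// is distinct, so (constant << i) >> 58 is a different value for each i.
static const BitBoard kDeBruijn64 = 0x03f79d71b4cb0a89ULL;

// Hypothetical helper: builds the 64-byte lookup table once at startup
// instead of hard-coding it, so the table always matches the constant.
struct BitScanTable {
    unsigned char index[64];
    BitScanTable() {
        for (unsigned i = 0; i < 64; ++i)
            index[(kDeBruijn64 << i) >> 58] = static_cast<unsigned char>(i);
    }
};

static const BitScanTable g_bitScanTable;

// Returns the index (0..63) of the lowest set bit and clears that bit.
// Precondition: bb != 0.
inline unsigned BitSearchAndResetTable(BitBoard &bb)
{
    BitBoard lowest = bb & (~bb + 1);   // isolate the lowest set bit
    bb ^= lowest;                       // reset it in the board
    // Multiplying by a single bit is a shift, so the top 6 bits of the
    // product select the table entry for that bit position.
    return g_bitScanTable.index[(lowest * kDeBruijn64) >> 58];
}
```

Called in a loop (while (bb) sq = BitSearchAndResetTable(bb);) it walks the set
bits from low to high. Whether it beats the bsf versions on a given CPU has to
be measured.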
-Matt
Last modified: Thu, 15 Apr 21 08:11:13 -0700