Author: Gerd Isenberg
Date: 02:32:43 01/22/03
On January 22, 2003 at 04:20:04, Matt Taylor wrote:

>On January 22, 2003 at 03:00:22, Gerd Isenberg wrote:
>
>>On January 22, 2003 at 02:54:44, Matt Taylor wrote:
>>
>>><snip>
>>>>__forceinline
>>>>UINT BitSearchAndReset(BitBoard &bb)
>>>>{
>>>>#ifdef _M_IX86
>>>>#ifdef USE_SAVE_BSF
>>>>    __asm
>>>>    {
>>>>        xor     edx, edx
>>>>        mov     ebx, [bb]
>>>>        xor     eax, eax
>>>>        inc     edx
>>>>        bsf     ecx, [ebx]
>>>>        jnz     found
>>>>        bsf     ecx, [ebx + 4]
>>>>        lea     ebx, [ebx + 4]
>>>>        xor     eax, 32
>>>>    found:
>>>>        shl     edx, cl
>>>>        xor     eax, ecx
>>>>        xor     [ebx], edx
>>>>    }
>>>>#else
>>>>    __asm
>>>>    {
>>>>        mov     edx, [bb]
>>>>        bsf     eax, [edx+4]
>>>>        xor     eax, 32
>>>>        bsf     eax, [edx]
>>>>        btr     [edx],eax
>>>>    }
>>>>#endif
>>>>...
>>>>}
>>>>
>>>>Regards,
>>>>Gerd
>>>
>>>Remember, bsr/bsf/btr are -slow- on Athlon. That bit search & reset is at least
>>>16 clocks (bsf twice, likely ~5-6 clocks additional overhead at a glance). I
>>>guess I should give up on the last bitscan and post my results. Walter's routine
>>>optimizes to 11 clocks + 134 byte table.
>>>
>>>-Matt
>>
>>Hi Matt,
>>
>>Yes i know, tried a lot, but this inlined conditional version is fastest so far
>>in IsiChess - but not in dumb loop test. Remaining code size matters a lot and
>>it is used so often in my program...
>>
>>Cheers,
>>Gerd
>
>Why would code size matter?
>
>-Matt

If I use an inlined function, let's say, 256 times, it matters whether the function is 32 bytes or 128 bytes long, even if the 128-byte function is the faster one when it is already in the cache.

Gerd
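To put rough numbers on the code-size argument, here is a minimal sketch in portable C. It is not the IsiChess routine: `bit_search_and_reset` and `inlined_footprint` are hypothetical names chosen for illustration, the bit scan is a plain loop rather than `bsf`, and the 256 call sites and 32/128-byte body sizes are simply the figures from the reply above. It only assumes that a bitboard is a 64-bit unsigned integer and that the routine returns the index of the lowest set bit and clears it.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t BitBoard;  /* assumption: one bit per square, 64 bits */

/* Returns the index (0..63) of the least significant set bit of *bb
   and clears that bit. Precondition: *bb != 0.
   Like the quoted assembly, it examines the low 32-bit half first and
   falls back to the high half, adding 32 to the index. */
static unsigned bit_search_and_reset(BitBoard *bb)
{
    assert(*bb != 0);

    BitBoard b = *bb;
    uint32_t half = (uint32_t)b;   /* low half */
    unsigned index = 0;

    if (half == 0) {               /* nothing in the low half */
        half = (uint32_t)(b >> 32);
        index = 32;
    }
    while ((half & 1u) == 0) {     /* portable stand-in for bsf */
        half >>= 1;
        ++index;
    }

    *bb = b & (b - 1);             /* clear the lowest set bit */
    return index;
}

/* Total code bytes emitted when a body of body_bytes is inlined at
   call_sites separate places (ignores alignment padding). */
static unsigned inlined_footprint(unsigned call_sites, unsigned body_bytes)
{
    return call_sites * body_bytes;
}
```

With the figures from the post, `inlined_footprint(256, 32)` is 8192 bytes (8 KB) and `inlined_footprint(256, 128)` is 32768 bytes (32 KB). The larger variant puts four times as much code in competition for the instruction cache, which is why a routine that wins an isolated loop test can still lose inside the full program.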
Last modified: Thu, 15 Apr 21 08:11:13 -0700