Author: Vincent Diepeveen
Date: 00:59:11 09/17/01
On September 16, 2001 at 22:44:36, Robert Hyatt wrote:

>On September 16, 2001 at 16:18:01, Vincent Diepeveen wrote:
>
>>On September 16, 2001 at 09:25:45, Robert Hyatt wrote:
>>
>>>On September 15, 2001 at 22:31:04, Vincent Diepeveen wrote:
>>>
>>>>On September 15, 2001 at 20:34:36, Robert Hyatt wrote:
>>>>
>>>>>On September 15, 2001 at 14:30:40, Vincent Diepeveen wrote:
>>>>
>>>>[snip]
>>>>>FirstOne() isn't particularly slow, using the bit-scan instructions that
>>>>>are very fast on PII and beyond. It certainly isn't much of a cost in the
>>>>>profile runs I do.
>>>>
>>>>FORCEINLINE int FirstOne(BITBOARD a) {
>>>>
>>>>#if _M_IX86 <= 500 /* on plain Pentiums, use boolean.c algorithm */
>>>>  __asm {
>>>>        movzx   edx, word ptr a+6
>>>>        xor     eax, eax
>>>>        test    edx, edx
>>>>        jnz     l1
>>>>        mov     dx, word ptr a+4
>>>>        mov     eax, 16
>>>>        test    edx, edx
>>>>        jnz     l1
>>>>        mov     dx, word ptr a+2
>>>>        mov     eax, 32
>>>>        test    edx, edx
>>>>        jnz     l1
>>>>        mov     dx, word ptr a
>>>>        mov     eax, 48
>>>>  l1:   add     al, byte ptr first_ones[edx]
>>>>  }
>>>>#else /* BSF and BSR are *fast* instructions on PPro/PII */
>>>>  __asm {
>>>>        bsr     edx, dword ptr a+4
>>>>        mov     eax, 31
>>>>        jnz     l1
>>>>        bsr     edx, dword ptr a
>>>>        mov     eax, 63
>>>>        jnz     l1
>>>>        mov     edx, -1
>>>>  l1:   sub     eax, edx
>>>>  }
>>>>#endif /* _M_IX86 > 500 */
>>>>}
>>>>
>>>>Ugh ugh, how slow is 'particularly slow' slow in your dictionary?
>>>>
>>>>Best regards,
>>>>Vincent
>>>
>>>A few clock cycles is how slow this is. Fast enough that it doesn't register
>>>on the profile runs to any significant degree... If profile doesn't complain,
>>>then I don't worry about it.
>>
>>I find this function *dead* slow.
>>
>>and it's not a 'few' clocks. it's tens of clocks!
>
>
>It is not many "tens of clocks" or it would be showing up on my profile
>runs. It doesn't.

Of course it is tens of clocks. See the P2 and P3 manuals. Not to mention the
bigger penalties the P4 and K7 show for branch mispredictions! This is very
easy to see.
The statement that bsf/bsr are fast instructions is irrelevant, because either of the conditional jumps, when mispredicted, costs about 30 times as much as those instructions do.
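To make the two conditional jumps under discussion explicit, here is a rough portable C model of the BSR branch of the quoted FirstOne(). It is only a sketch: the names first_one_model and bsr32 are made up for illustration and are not from Crafty, BITBOARD is assumed to be an unsigned 64-bit integer, and the loop in bsr32 merely stands in for the single BSR instruction. It says nothing about timing; it just shows where the two data-dependent branches sit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the highest set bit of a nonzero 32-bit
   word, i.e. the value BSR would leave in its destination register. */
static int bsr32(uint32_t x) {
    int i = 0;
    assert(x != 0);          /* BSR leaves its destination undefined for 0 */
    while (x >>= 1) i++;     /* count the shifts needed to reach the top bit */
    return i;
}

/* Model of the quoted BSR code path. Bits are numbered from the most
   significant end: bit 63 -> 0, bit 0 -> 63, empty bitboard -> 64. */
static int first_one_model(uint64_t a) {
    uint32_t hi = (uint32_t)(a >> 32);
    uint32_t lo = (uint32_t)a;
    if (hi) return 31 - bsr32(hi);   /* first jnz taken */
    if (lo) return 63 - bsr32(lo);   /* second jnz taken */
    return 64;                       /* edx = -1, eax = 63: 63 - (-1) */
}
```

Which of the two branches is taken depends entirely on where the highest set bit of the bitboard lies, so how well they predict depends on the positions being searched, not on the code.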
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.