Computer Chess Club Archives


Subject: Re: Move generation question for the big boys

Author: Vincent Diepeveen

Date: 00:59:11 09/17/01


On September 16, 2001 at 22:44:36, Robert Hyatt wrote:

>On September 16, 2001 at 16:18:01, Vincent Diepeveen wrote:
>
>>On September 16, 2001 at 09:25:45, Robert Hyatt wrote:
>>
>>>On September 15, 2001 at 22:31:04, Vincent Diepeveen wrote:
>>>
>>>>On September 15, 2001 at 20:34:36, Robert Hyatt wrote:
>>>>
>>>>>On September 15, 2001 at 14:30:40, Vincent Diepeveen wrote:
>>>>
>>>>[snip]
>>>>>FirstOne() isn't particularly slow, using the bit-scan instructions that
>>>>>are very fast on PII and beyond.  It certainly isn't much of a cost in the
>>>>>profile runs I do.
>>>>
>>>>FORCEINLINE int FirstOne(BITBOARD a) {
>>>>
>>>>#if _M_IX86 <= 500 /* on plain Pentiums, use boolean.c algorithm */
>>>>  __asm {
>>>>        movzx   edx, word ptr a+6
>>>>        xor     eax, eax
>>>>        test    edx, edx
>>>>        jnz     l1
>>>>        mov     dx, word ptr a+4
>>>>        mov     eax, 16
>>>>        test    edx, edx
>>>>        jnz     l1
>>>>        mov     dx, word ptr a+2
>>>>        mov     eax, 32
>>>>        test    edx, edx
>>>>        jnz     l1
>>>>        mov     dx, word ptr a
>>>>        mov     eax, 48
>>>>  l1:   add     al, byte ptr first_ones[edx]
>>>>  }
>>>>#else /* BSF and BSR are *fast* instructions on PPro/PII */
>>>>  __asm {
>>>>        bsr     edx, dword ptr a+4
>>>>        mov     eax, 31
>>>>        jnz     l1
>>>>        bsr     edx, dword ptr a
>>>>        mov     eax, 63
>>>>        jnz     l1
>>>>        mov     edx, -1
>>>>  l1:   sub     eax, edx
>>>>  }
>>>>#endif /* _M_IX86 > 500 */
>>>>}
>>>>
>>>>Ugh ugh, how slow is 'particularly slow' in your dictionary?
>>>>
>>>>Best regards,
>>>>Vincent
>>>
>>>A few clock cycles is how slow this is.  Fast enough that it doesn't register
>>>on the profile runs to any significant degree...  If the profile doesn't
>>>complain, then I don't worry about it.
>>
>>I find this function *dead* slow.
>>
>>And it's not a 'few' clocks. It's tens of clocks!
>
>
>It is not many "tens of clocks" or it would be showing up on my profile
>runs.  It doesn't.

Of course it is tens of clocks. See the P2 and P3 manuals. Not to mention
the bigger penalties the P4 and K7 show for branch mispredictions!

This is very easy to see. The statement that bsf/bsr are fast instructions
is irrelevant, because either of the conditional jumps, when mispredicted,
costs about 30 times as much as those instructions.
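
For readers who do not read MSVC inline assembly, the control flow of the BSR path quoted above can be sketched in portable C. This is only an illustration, not code from Crafty: the names first_one_halves and highest_bit32 are hypothetical, and highest_bit32 is a plain C stand-in for what the bsr instruction computes. The two if tests in first_one_halves correspond to the two jnz instructions being argued about.

```c
#include <stdint.h>

typedef uint64_t BITBOARD;

/* Hypothetical stand-in for the bsr instruction: position (0..31) of the
   highest set bit of x. The caller must guarantee x != 0. */
static int highest_bit32(uint32_t x) {
    int pos = 0;
    if (x >= 1u << 16) { x >>= 16; pos += 16; }
    if (x >= 1u << 8)  { x >>= 8;  pos += 8; }
    if (x >= 1u << 4)  { x >>= 4;  pos += 4; }
    if (x >= 1u << 2)  { x >>= 2;  pos += 2; }
    if (x >= 1u << 1)  { pos += 1; }
    return pos;
}

/* Mirrors the quoted asm: look at the high 32 bits first, then the low
   32 bits. The result counts from the top, so bit 63 gives 0, bit 0 gives
   63, and an empty board gives 64 (the asm computes 63 - (-1)). */
static int first_one_halves(BITBOARD a) {
    uint32_t hi = (uint32_t)(a >> 32);
    uint32_t lo = (uint32_t)a;

    if (hi != 0)                      /* first jnz in the asm */
        return 31 - highest_bit32(hi);
    if (lo != 0)                      /* second jnz in the asm */
        return 63 - highest_bit32(lo);
    return 64;                        /* no bit set */
}
```

Which half holds the first set bit depends on the position being searched, so whether these two tests predict well is a property of the data. That is the point in dispute, and this sketch does not settle it either way.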





Last modified: Thu, 15 Apr 21 08:11:13 -0700
