Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Nalimov: bsf/bsr intrinsics implementation still not optimal

Author: Dezhi Zhao

Date: 07:44:09 09/23/04

Go up one level in this thread


On September 23, 2004 at 09:37:46, Gerd Isenberg wrote:

>On September 23, 2004 at 08:31:33, Dezhi Zhao wrote:
>
>>On September 23, 2004 at 06:12:48, Gerd Isenberg wrote:
>>
>>>On September 22, 2004 at 13:55:46, Dezhi Zhao wrote:
>>>
>>>>I'm  really happy that bit operation instructions have become intrinsics for VC
>>>>compiler in VS 2003 and later.
>>>>
>>>>However the output asm code is still not optimal. It generates a pair of
>>>>redundant memory-register save and load instrucions. I also tested VC 2005
>>>>Express beta1. The same thing again....
>>>
>>>Yes, such not neccesary store/loads seem a problem with
>>>msc-intrinsics:
>>>
>>>http://chessprogramming.org/cccsearch/ccc.php?art_id=357561
>>>http://chessprogramming.org/cccsearch/ccc.php?art_id=357568
>>>http://chessprogramming.org/cccsearch/ccc.php?art_id=357570
>>>
>>>But in your case, based on the prototype of the instrinsic, i guess a boolean
>>>return, but a pointer to index, the bsf-intrinsic is foreced to store the found
>>>bitindex to memory.
>>>
>>>Gerd
>>>
>>
>>You're probally right that the prototype fools the compiler. However I think the
>>complier still has good oppotunities to eliminate those store/load instructions
>>when it comes to optimize the code.
>
>I think that too, but that is probably not so easy, considering alias pointer to
>the same address.

If the pointer variable is a global one, it would not be easy.
But as you see I declare it local.....

>
>>Now look at the prototype again:
>>
>>unsigned char _BitScanForward(unsigned long* index, unsigned long mask);
>>
>>It returns a boolean value. This is to handle the case where mask == 0.
>>I don't think BitScanForward needs such exception handler. I like this better:
>>
>>long _BitScanForward(long mask);
>>
>>But you need to add a warning in the documentation that the return value is
>>undefined if mask == 0 just as the one you can find in bsf/bsr documentation.
>
>Yes, i like the none pointer prototype much better too.
>
>But leaving an undefined return value for a C-instrinsic is probably too
>dangerous - even if it is most often used in this while(mask) or if(mask)
>context. I suggest returning an negative (or > 31,63) value for mask == 0.
>

Not dangerous at all. Probally almost all programers who care about to use those
intrinsics are asm guys already. They know what they are doing. As you said, we
test the mask _before_ the scan operation.


>An additional jnc after bsf/bsr is not so expensive, if always not taken,
>specially compared to the huge latency of bsf/bsr. But it leaves the option to
>build the traverse loop without an explicit != zero test.
>

That's exactly why you should do the test before bit scan. Should the return
value be used as you described,  the prototype would be blamed encouraging
sub-optimal programming:)

>>
>>I'm sure if it were of the simplier proptype, the current compiler would not
>>generate the redundant store/load instructions.
>>
>>Zhao
>
>Yes.
>
>Gerd



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.