Computer Chess Club Archives


Subject: Re: Nalimov: bsf/bsr intrinsics implementation still not optimal

Author: Dezhi Zhao

Date: 11:22:43 09/23/04


Better, but still not optimal. Please see my comment inline below.



On September 23, 2004 at 13:17:08, Bo Persson wrote:

>On September 22, 2004 at 13:55:46, Dezhi Zhao wrote:
>
>>I'm really happy that the bit operation instructions have become intrinsics for
>>the VC compiler in VS 2003 and later.
>>
>>However, the output asm code is still not optimal. It generates a pair of
>>redundant memory-register save and load instructions. I also tested VC 2005
>>Express beta1. The same thing again....
>>
>
>Have you tried the "Tools Refresh" for Beta 1?
>
>http://www.microsoft.com/downloads/details.aspx?FamilyID=afd04ff1-9d16-439a-9a5e-e13eb0341923&displaylang=en
>
>Here is the output of the September release:
>
>; 18   : 	while(mask)
>
>  00006	85 f6		 test	 esi, esi
>  00008	74 1a		 je	 SHORT $LN1@testbsf
>  0000a	8d 9b 00 00 00
>	00		 npad	 6
>$LL2@testbsf:
>
>; 19   : 	{
>; 20   : 		unsigned long index;
>; 21   : 		_BitScanForward(&index, mask);
>
>  00010	0f bc ce	 bsf	 ecx, esi
>
>; 22   : 		m1 = 1 << index;
>
>  00013	ba 01 00 00 00	 mov	 edx, 1
>  00018	d3 e2		 shl	 edx, cl

>  0001a	89 4c 24 04	 mov	 DWORD PTR _index$15120[esp+8], ecx

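The store above is of no use at all: index is never read back from memory in this loop, and the shl already takes the bit position from cl.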

>
>; 23   : 		clone |= m1;
>
>  0001e	0b c2		 or	 eax, edx
>
>; 24   : 		mask ^= m1;
>
>  00020	33 f2		 xor	 esi, edx
>  00022	75 ec		 jne	 SHORT $LL2@testbsf
>$LN1@testbsf:
>  00024	5e		 pop	 esi
>
>; 25   : 	}
>; 26   :
>; 27   : 	return clone;
>; 28   : }
>
>
>Already fixed!   :-)
>
>
>Bo Persson
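
For reference, a source along these lines should give a loop like the one quoted above. It is only a sketch rebuilt from the source lines interleaved in the listing (18 to 27), not the actual test file. The function signature, the lowest_set_bit wrapper, the assert, and the GCC/Clang fallback are my assumptions, and I use 1UL instead of 1 so that shifting into bit 31 stays well defined.

```c
#include <assert.h>

#ifdef _MSC_VER
#include <intrin.h>
#pragma intrinsic(_BitScanForward)
#endif

/* Hypothetical wrapper (not in the original test): returns the index of
   the lowest set bit of a nonzero mask. */
static unsigned long lowest_set_bit(unsigned long mask)
{
    unsigned long index;

    assert(mask != 0);  /* the scan result is undefined for a zero mask */
#ifdef _MSC_VER
    /* MSVC intrinsic; on x86 this becomes a single bsf instruction. */
    _BitScanForward(&index, mask);
#else
    /* Assumed fallback for GCC/Clang so the sketch builds elsewhere. */
    index = (unsigned long)__builtin_ctzl(mask);
#endif
    return index;
}

/* Rebuilds mask one bit at a time, lowest bit first, mirroring the loop
   in the listing: scan, form the single-bit mask, set it in clone, and
   clear it from mask. */
unsigned long testbsf(unsigned long mask)
{
    unsigned long clone = 0;

    while (mask) {
        unsigned long index = lowest_set_bit(mask);
        unsigned long m1 = 1UL << index;  /* listing has 1 << index */

        clone |= m1;
        mask ^= m1;
    }
    return clone;
}
```

Since index is only ever consumed by the shift, the compiler is free to keep it in a register for the whole loop, which is why the store to the stack slot in the listing buys nothing.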



Last modified: Thu, 15 Apr 21 08:11:13 -0700
