Computer Chess Club Archives


Subject: Re: MS compiler issue [OT]

Author: Gerd Isenberg

Date: 00:58:56 10/16/05


On October 16, 2005 at 01:33:08, Scott Gasch wrote:

>Again, with FASTCALL on the function.  This is MS C++ .net 2003 I think.  I left
>the comments in:
>
>; Listing generated by Microsoft (R) Optimizing Compiler Version 13.10.3077
>
>PUBLIC	@getDayIndex1March00@12
>; Function compile flags: /Ogty
>_TEXT	SEGMENT
>_year$ = 8						; size = 4
>@getDayIndex1March00@12 PROC NEAR
>; _day$ = ecx
>; _month$ = edx
>
>; 637  : {
>
>	push	ebx
>	push	esi
>
>; 638  :         static int daysTilMonth[12] =
>; 639  :         {
>; 640  :                 6*31 + 4*30, // jan
>; 641  :                 7*31 + 4*30, // feb
>; 642  :                 0*31 + 0*30, // mar
>; 643  :                 1*31 + 0*30, // apr
>; 644  :                 1*31 + 1*30, // may
>; 645  :                 2*31 + 1*30, // jun
>; 646  :                 2*31 + 2*30, // jul
>; 647  :                 3*31 + 2*30, // aug
>; 648  :                 4*31 + 2*30, // sep
>; 649  :                 4*31 + 3*30, // oct
>; 650  :                 5*31 + 3*30, // nov
>; 651  :                 5*31 + 4*30, // dec
>; 652  :         };
>; 653  :         unsigned int cent, didx;
>; 654  :         year -= (month < 3);
>
>	mov	esi, DWORD PTR _year$[esp+4]
>	push	edi
>	mov	edi, edx

Month is passed in edx, but edx is overwritten later by the 32*32=64-bit mul.
So the register edi is used to keep that parameter for its later use as array index.
Unfortunately edi is non-volatile (callee-saved) and must be saved/restored on the stack!?
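
The mul that clobbers edx is how the compiler implements cent = year / 100: it multiplies by the constant 1374389535 (51eb851fH), keeps the high half in edx, and shifts it right by 5. A minimal C sketch of that step, assuming year is a 32-bit unsigned value (the function name div100_by_mul is made up for illustration):

```c
#include <stdint.h>

/* year / 100 without a div instruction, mirroring the listing:
   mov eax, 1374389535 / mul esi / shr edx, 5 */
uint32_t div100_by_mul(uint32_t year)
{
    /* 32*32 = 64-bit product; the mul instruction leaves it in edx:eax */
    uint64_t product = (uint64_t)year * 1374389535u;   /* 51eb851fH */
    uint32_t high = (uint32_t)(product >> 32);         /* edx */
    return high >> 5;                                  /* shr edx, 5 */
}
```

The constant is 2^37 / 100 rounded up, so taking the high dword and shifting by 5 amounts to dividing the product by 2^37.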

>	cmp	edi, 3
>	sbb	eax, eax
>	neg	eax
>	sub	esi, eax

This is really funny: a literal one-to-one coding of
year -= (month < 3). At least it is branchless ;-)

The first optimization idea is to replace neg, sub by add, since sbb eax, eax
already leaves -1 (month < 3) or 0 in eax:

	cmp	edi, 3
	sbb	eax, eax
	add	esi, eax

The second optimization, as already mentioned, is to subtract the carry (borrow)
directly from the year register, like gcc does:

	cmp	edi, 3
	sbb	esi, 0
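
To see that the three sequences compute the same thing, here is a rough C model of each one. It assumes 32-bit unsigned operands, and the function names are invented for this sketch; a compiler is of course free to emit something else for them.

```c
#include <stdint.h>

/* MSVC's version: cmp edi,3 / sbb eax,eax / neg eax / sub esi,eax */
uint32_t dec_neg_sub(uint32_t year, uint32_t month)
{
    uint32_t mask = (month < 3) ? 0xFFFFFFFFu : 0u;  /* sbb eax, eax: -1 or 0 */
    uint32_t one  = 0u - mask;                       /* neg eax: 1 or 0 */
    return year - one;                               /* sub esi, eax */
}

/* First idea: cmp edi,3 / sbb eax,eax / add esi,eax */
uint32_t dec_add_mask(uint32_t year, uint32_t month)
{
    uint32_t mask = (month < 3) ? 0xFFFFFFFFu : 0u;  /* sbb eax, eax */
    return year + mask;                              /* adding -1 wraps to year - 1 */
}

/* Second idea: cmp edi,3 / sbb esi,0 */
uint32_t dec_sbb(uint32_t year, uint32_t month)
{
    return year - (uint32_t)(month < 3);             /* subtract the borrow directly */
}
```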

>
>; 655  :         cent  = year / 100;
>
>	mov	eax, 1374389535				; 51eb851fH
>	mul	esi
>
>; 656  :         didx  = year * 365 + (year>>2) - cent + (cent>>2)
>; 657  :                   + daysTilMonth[month-1] + day;
>
>	lea	eax, DWORD PTR [esi+esi*8]
>	lea	eax, DWORD PTR [esi+eax*8]
>	lea	ebx, DWORD PTR [eax+eax*4]


I guess the three dependent leas take the full 3*2 = 6 cycles.
For amd64 I suggest 3 cycles and less code:
         imul   ebx, esi, 365
Is mul still slower on Intel CPUs (P4, Centrino)?
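
For reference, the lea chain builds 365 as 9 -> 73 -> 365. A small C sketch of both variants, assuming 32-bit unsigned arithmetic (function names are hypothetical):

```c
#include <stdint.h>

/* The three leas from the listing, one statement per instruction */
uint32_t times365_lea(uint32_t x)
{
    uint32_t t = x + x * 8;   /* lea eax, [esi+esi*8]  ->   9*x */
    t = x + t * 8;            /* lea eax, [esi+eax*8]  ->  73*x */
    return t + t * 4;         /* lea ebx, [eax+eax*4]  -> 365*x */
}

/* The single-instruction alternative: imul ebx, esi, 365 */
uint32_t times365_imul(uint32_t x)
{
    return x * 365u;
}
```

Each lea needs the result of the previous one, so the chain cannot overlap with itself.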


>
>; 658  :         return didx;
>
>	mov	eax, DWORD PTR ?daysTilMonth@?1??getDayIndex1March00@@9@9[edi*4-4]
>	shr	edx, 5
>	add	eax, ebx
>	mov	edi, edx
>	shr	edi, 2
>	add	eax, edi
>	shr	esi, 2
>	add	eax, esi
>	pop	edi
>	sub	eax, edx
>	pop	esi
>	add	eax, ecx


Only here, at the very end, is the fastcall register ecx for "day" used.
Hmm - it would probably be smarter to swap year and day in the parameter list,
so that year, which is needed first, arrives in a register.
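
A sketch of what the reordered function could look like, with the body taken from source lines 638-658 of the listing. The name getDayIndexYearFirst, the unsigned parameter types and the const on the table are my assumptions (the listing only shows unsigned instructions, not the prototype), and the MSVC-specific __fastcall keyword is left out so the code stays portable.

```c
/* Hypothetical reordering: year first (would arrive in ecx under fastcall),
   month second (edx), day last (stack). Arithmetic unchanged. */
unsigned int getDayIndexYearFirst(unsigned int year, unsigned int month,
                                  unsigned int day)
{
    static const int daysTilMonth[12] =
    {
        6*31 + 4*30, /* jan */
        7*31 + 4*30, /* feb */
        0*31 + 0*30, /* mar */
        1*31 + 0*30, /* apr */
        1*31 + 1*30, /* may */
        2*31 + 1*30, /* jun */
        2*31 + 2*30, /* jul */
        3*31 + 2*30, /* aug */
        4*31 + 2*30, /* sep */
        4*31 + 3*30, /* oct */
        5*31 + 3*30, /* nov */
        5*31 + 4*30, /* dec */
    };
    unsigned int cent, didx;
    year -= (month < 3);            /* jan and feb count to the previous year */
    cent  = year / 100;
    didx  = year * 365 + (year >> 2) - cent + (cent >> 2)
          + daysTilMonth[month - 1] + day;
    return didx;
}
```

Whether the compiler actually produces better code for this order would have to be checked in a new listing.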

>	pop	ebx
>
>; 659  : }
>
>	ret	4
>@getDayIndex1March00@12 ENDP
>_TEXT	ENDS
>END

Thanks,
Gerd




Last modified: Thu, 15 Apr 21 08:11:13 -0700
