Computer Chess Club Archives


Subject: Re: odd msc 2005 behaviour - 64-bit mode

Author: Gerd Isenberg

Date: 06:38:14 06/15/05


On June 13, 2005 at 23:38:27, Eugene Nalimov wrote:

>On June 13, 2005 at 14:14:09, Gerd Isenberg wrote:
>
>>On June 13, 2005 at 13:45:16, Eugene Nalimov wrote:
>>
>>>On June 13, 2005 at 13:23:53, Gerd Isenberg wrote:
>>>
>>>>hi, compiler experts!
>>>>
>>>>Inside a recursive search routine (not alpha/beta but my fruit fly ;-) with only
>>>>the this-pointer, one additional integer parameter and one local, msc2005 wastes 40
>>>>bytes (72 with other optimizations) of stack space on each call. A new stack
>>>>defragmentation trick by ms? For 8-byte alignment that padding seems a bit too
>>>>large. Each call eats one cacheline.
>>>>Can someone please explain what's going on here ;-)
>>>
>>>Calling conventions. You should reserve (I believe) 32 bytes on the stack for
>>>the function you are calling. The extra 8 bytes are because the stack should be
>>>16-byte aligned, but on function entry it is 8-byte aligned, and we are saving
>>>an even number of registers.
>>
>>I see - usually we have some more variables on the stack - so the waste becomes
>>relatively smaller, if not zero.
>>
>>Otoh there are 3 register parameters as well as a lot of remaining registers.
>>A recursive, very compact qsearch ...
>
>You compiled your function optimized for size (/O1), and because of that the
>compiler decided to use very short PUSH/POP instructions to save/restore
>registers, even though it results in some unused slots on the stack.
>If you compile your program optimizing for speed (/O2 or /Ox), the compiler
>will use MOV instructions, and it will save registers into the empty stack
>slots provided by the caller:
>
>; Listing generated by Microsoft (R) Optimizing Compiler Version 14.00.50317
>...
>
>?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC	; DeBruijnGenerator::searchDeBruijn
>	sub	rsp, 40					; 00000028H
>	mov	QWORD PTR [rsp+48], rbx
>	mov	rbx, rcx
>; Line 62
>	mov	ecx, DWORD PTR [rcx+32]
>	cmp	ecx, 1
>	mov	QWORD PTR [rsp+72], rdi
>	jbe	$LN5@searchDeBr
>	mov	QWORD PTR [rsp+56], rbp
>	mov	QWORD PTR [rsp+64], rsi
>	...
>
>Is that what you want?


Eugene,

excuse my ignorance - but I still have some trouble understanding the calling
convention of msc2005 for x86-64. What are those 32++ bytes, "allocated" by the
caller, necessary for?

Are they needed even when, as in the /O1 case, the callee does not save any
registers there but pushes and pops them in a disjoint stack area?

And if registers are saved, why not in the frame of the callee instead of the
caller's?

?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC	; DeBruijnGenerator::searchDeBruijn
	sub	rsp, 40
	; rsp+40 is return address
	mov	QWORD PTR [rsp + 0], rbx
	mov	rbx, rcx
; Line 62
	mov	ecx, DWORD PTR [rcx+32]
	cmp	ecx, 1
	mov	QWORD PTR [rsp+24], rdi
	jbe	$LN5@searchDeBr
	mov	QWORD PTR [rsp+8], rbp
	mov	QWORD PTR [rsp+16], rsi
	...


Thanks,
Gerd





Last modified: Thu, 15 Apr 21 08:11:13 -0700
