Computer Chess Club Archives


Subject: Re: odd msc 2005 behaviour - 64-bit mode

Author: Eugene Nalimov

Date: 07:58:50 06/15/05


On June 15, 2005 at 09:38:14, Gerd Isenberg wrote:

>On June 13, 2005 at 23:38:27, Eugene Nalimov wrote:
>
>>On June 13, 2005 at 14:14:09, Gerd Isenberg wrote:
>>
>>>On June 13, 2005 at 13:45:16, Eugene Nalimov wrote:
>>>
>>>>On June 13, 2005 at 13:23:53, Gerd Isenberg wrote:
>>>>
>>>>>hi, compiler experts!
>>>>>
>>>>>Inside a recursive search routine (not alpha/beta but my fruit fly ;-) with only
>>>>>the this-pointer and one additional integer parameter and local, msc2005 wastes 40
>>>>>bytes (72 with other optimizations) of stack space on each call. A new stack
>>>>>defragmentation trick by ms? For 8-byte alignment that padding seems a bit too
>>>>>huge. Each call eats one cacheline.
>>>>>Can someone please explain what's going on here ;-)
>>>>
>>>>Calling conventions. You should reserve (I believe) 32 bytes on the stack for
>>>>the function you are calling. The extra 8 bytes are because the stack should be
>>>>16-byte aligned, but on function entry it is only 8-byte aligned, and we are
>>>>saving an even number of registers.
>>>
>>>I see - usually we have some more variables on the stack - so the waste becomes
>>>relatively smaller, if not zero.
>>>
>>>Otoh there are 3 register parameters as well as a lot of remaining registers.
>>>A recursive, very compact qsearch ...
>>
>>You compiled your function optimized for size (/O1), and because of that the
>>compiler decided to use very short PUSH/POP instructions to save/restore
>>registers, even though that leaves some unused slots on the stack. If you compile
>>your program optimizing for speed (/O2 or /Ox), the compiler will use MOV
>>instructions, and it will save registers into the empty stack slots provided by
>>the caller:
>>
>>; Listing generated by Microsoft (R) Optimizing Compiler Version 14.00.50317
>>...
>>
>>?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC	; DeBruijnGenerator::searchDeBruijn
>>	sub	rsp, 40					; 00000028H
>>	mov	QWORD PTR [rsp+48], rbx
>>	mov	rbx, rcx
>>; Line 62
>>	mov	ecx, DWORD PTR [rcx+32]
>>	cmp	ecx, 1
>>	mov	QWORD PTR [rsp+72], rdi
>>	jbe	$LN5@searchDeBr
>>	mov	QWORD PTR [rsp+56], rbp
>>	mov	QWORD PTR [rsp+64], rsi
>>	...
>>
>>Is that what you want?
>
>
>Eugene,
>
>excuse my ignorance - but I still have some problems understanding the calling
>convention of msc2005 for x86-64. For what are those 32++ bytes necessary,
>"allocated" by the caller?

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/kmarch/hh/kmarch/64bitAMD_6ec00b51-bf75-41bf-8635-caa8653c8bd9.xml.asp

Thanks,
Eugene
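
For reference, a minimal sketch of where the caller-reserved slots land relative to rsp in the /O2 listing quoted above. It assumes the Microsoft x64 convention as described in this thread: the caller reserves 32 bytes (four 8-byte slots) directly above the return address. The name `home_slot_offset` is hypothetical, chosen only for illustration.

```cpp
#include <cstdint>

// Assumption: on entry [rsp] holds the return address and the four
// caller-reserved 8-byte slots sit immediately above it.
constexpr std::uint32_t kReturnAddrBytes = 8;

// Offset from rsp, after the callee has executed `sub rsp, frame_bytes`,
// of caller-reserved slot `slot` (0..3).
constexpr std::uint32_t home_slot_offset(std::uint32_t frame_bytes,
                                         std::uint32_t slot) {
    return frame_bytes + kReturnAddrBytes + 8u * slot;
}
```

With `frame_bytes` = 40 the four slots come out at rsp+48, rsp+56, rsp+64 and rsp+72, which are the addresses the MOV instructions in the listing store rbx, rbp, rsi and rdi to.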

>Even if - as in the /O1 case - the callee does not save any registers there, but
>pushes and pops them in a disjoint stack area?
>
>If saving registers, why not on the frame of the callee instead of the caller?
>
>?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC	; DeBruijnGenerator::searchDeBruijn
>	sub	rsp, 40
>        ; rsp+40 is return address
>	mov	QWORD PTR [rsp + 0], rbx
>	mov	rbx, rcx
>; Line 62
>	mov	ecx, DWORD PTR [rcx+32]
>	cmp	ecx, 1
>	mov	QWORD PTR [rsp+24], rdi
>	jbe	$LN5@searchDeBr
>	mov	QWORD PTR [rsp+8], rbp
>	mov	QWORD PTR [rsp+16], rsi
>	...
>
>
>Thanks,
>Gerd
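
The 40 in `sub rsp, 40` can be reproduced with a little arithmetic. This is a rough sketch that assumes only the rules stated earlier in the thread: 32 bytes reserved for any function being called, rsp 8 bytes off a 16-byte boundary on entry (because of the return address), and 16-byte alignment required again before the next call. `frame_alloc`, `round_up8` and their parameters are hypothetical names, not anything the compiler exposes.

```cpp
#include <cstdint>

constexpr std::uint32_t kReservedForCallee = 32; // four 8-byte slots
constexpr std::uint32_t kReturnAddrBytes   = 8;  // pushed by CALL

// Round a byte count up to the next multiple of 8.
constexpr std::uint32_t round_up8(std::uint32_t n) {
    return (n + 7u) & ~7u;
}

// Bytes the callee subtracts from rsp, given how many registers it
// PUSHed beforehand and how many bytes of locals it needs.
// If return address + pushes + allocation is not a multiple of 16,
// one extra 8-byte padding slot is added.
constexpr std::uint32_t frame_alloc(std::uint32_t pushed_regs,
                                    std::uint32_t local_bytes) {
    return ((kReturnAddrBytes + 8u * pushed_regs + kReservedForCallee +
             round_up8(local_bytes)) % 16u == 0u)
               ? kReservedForCallee + round_up8(local_bytes)
               : kReservedForCallee + round_up8(local_bytes) + 8u;
}
```

With no pushes and no locals this gives 40, matching both listings. With an even number of pushed registers (four, say) it is still 40, while an odd number of pushes would need only 32, which is the "even number of registers" remark above.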



Last modified: Thu, 15 Apr 21 08:11:13 -0700
