Author: Gerd Isenberg
Date: 06:38:14 06/15/05
On June 13, 2005 at 23:38:27, Eugene Nalimov wrote:
>On June 13, 2005 at 14:14:09, Gerd Isenberg wrote:
>
>>On June 13, 2005 at 13:45:16, Eugene Nalimov wrote:
>>
>>>On June 13, 2005 at 13:23:53, Gerd Isenberg wrote:
>>>
>>>>hi, compiler experts!
>>>>
>>>>Inside a recursive search routine (not alpha/beta but my fruit fly ;-) with only
>>>>a this-pointer and one additional integer parameter and local, msc2005 wastes 40
>>>>bytes (72 with other optimizations) of stack space on each call. A new stack
>>>>defragmentation trick by ms? For 8-byte alignment those paddings seem a bit too
>>>>huge. Each call eats one cacheline.
>>>>Can someone please explain what's going on here ;-)
>>>
>>>Calling conventions. You should reserve (I believe) 32 bytes on the stack for
>>>the function you are calling. The extra 8 bytes are because the stack should be
>>>16-byte aligned, but on function entry it is only 8-byte aligned, and we are
>>>saving an even number of registers.
>>
>>I see - usually we have some more variables on the stack - so the waste becomes
>>relatively smaller, if not zero.
>>
>>Otoh there are 3 register parameters as well as a lot of remaining registers.
>>A recursive, very compact qsearch ...
>
>You compiled your function optimized for size (/O1), and because of that
>compiler decided to use very short PUSH/POP instructions to save/restore
>registers, even though it results in some unused slots on stack.
>If you compile your program optimizing for speed (/O2 or /Ox), compiler will
>use MOV instructions, and it will save registers into empty stack slots
>provided by caller:
>
>; Listing generated by Microsoft (R) Optimizing Compiler Version 14.00.50317
>...
>
>?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC ; DeBruijnGenerator::searchDeBruijn
> sub rsp, 40 ; 00000028H
> mov QWORD PTR [rsp+48], rbx
> mov rbx, rcx
>; Line 62
> mov ecx, DWORD PTR [rcx+32]
> cmp ecx, 1
> mov QWORD PTR [rsp+72], rdi
> jbe $LN5@searchDeBr
> mov QWORD PTR [rsp+56], rbp
> mov QWORD PTR [rsp+64], rsi
> ...
>
>Is that what you want?
Eugene,
excuse my ignorance, but I still have some trouble understanding the calling
convention of msc2005 for x86-64. What are those 32+ bytes, "allocated" by the
caller, necessary for?
Even in the /O1 case, the callee does not save any registers there, but
pushes and pops them in a disjoint stack area.
When saving registers, why not in the frame of the callee instead of the caller's, like this?
?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC ; DeBruijnGenerator::searchDeBruijn
sub rsp, 40
; rsp+40 is return address
mov QWORD PTR [rsp + 0], rbx
mov rbx, rcx
; Line 62
mov ecx, DWORD PTR [rcx+32]
cmp ecx, 1
mov QWORD PTR [rsp+24], rdi
jbe $LN5@searchDeBr
mov QWORD PTR [rsp+8], rbp
mov QWORD PTR [rsp+16], rsi
...
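
For reference, the arithmetic behind Eugene's explanation can be sketched in a few lines of C++. This is only an illustration under the assumptions stated in this thread (32 bytes reserved by every caller for its callee, 16-byte stack alignment at call sites, rsp being 8 bytes off alignment at function entry); the names frameBytes, callerHomeOffset and checkAgainstListing are made up here and are not part of any compiler or library.

```cpp
#include <cassert>

// Assumptions taken from the discussion above, not from any header:
constexpr unsigned kHomeBytes = 32; // area a caller reserves for its callee
constexpr unsigned kSlot      = 8;  // one 64-bit stack slot
constexpr unsigned kAlign     = 16; // required rsp alignment at a call site

// Hypothetical helper: smallest N for "sub rsp, N" in a function that calls
// other functions, has localBytes of locals and pushes pushedRegs registers.
constexpr unsigned frameBytes(unsigned localBytes, unsigned pushedRegs) {
    // Round locals up to whole slots and add the area reserved for callees.
    unsigned n = kHomeBytes + (localBytes + kSlot - 1) / kSlot * kSlot;
    // The return address (one slot) is already on the stack at entry.
    unsigned total = kSlot + pushedRegs * kSlot + n;
    if (total % kAlign != 0) n += kSlot; // one padding slot restores alignment
    return n;
}

// Hypothetical helper: offset from rsp (after the sub) of the i-th slot that
// the caller reserved for this function, just above the return address.
constexpr unsigned callerHomeOffset(unsigned frame, unsigned pushedRegs,
                                    unsigned i) {
    return frame + pushedRegs * kSlot + kSlot + i * kSlot;
}

// Compares the sketch with the numbers in the /O2 listing quoted above.
inline void checkAgainstListing() {
    assert(frameBytes(0, 0) == 40);             // sub rsp, 40
    assert(callerHomeOffset(40, 0, 0) == 48);   // [rsp+48] holds rbx
    assert(callerHomeOffset(40, 0, 1) == 56);   // [rsp+56] holds rbp
    assert(callerHomeOffset(40, 0, 2) == 64);   // [rsp+64] holds rsi
    assert(callerHomeOffset(40, 0, 3) == 72);   // [rsp+72] holds rdi
}
```

With no locals and no pushes the sketch gives 40, which matches the "sub rsp, 40" in the listing, and the four caller-provided slots land at rsp+48 to rsp+72, which is where the /O2 code stores rbx, rbp, rsi and rdi. Under the same assumptions, the slots at rsp+0 to rsp+31 would be the area this function in turn reserves for whatever it calls.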
Thanks,
Gerd
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.