Author: Eugene Nalimov
Date: 07:58:50 06/15/05
On June 15, 2005 at 09:38:14, Gerd Isenberg wrote:

>On June 13, 2005 at 23:38:27, Eugene Nalimov wrote:
>
>>On June 13, 2005 at 14:14:09, Gerd Isenberg wrote:
>>
>>>On June 13, 2005 at 13:45:16, Eugene Nalimov wrote:
>>>
>>>>On June 13, 2005 at 13:23:53, Gerd Isenberg wrote:
>>>>
>>>>>hi, compiler experts!
>>>>>
>>>>>Inside a recursive search routine (not alpha/beta but my fruit fly ;-) with only
>>>>>this-pointer and one additional integer parameter and local, msc2005 wastes 40
>>>>>bytes (72 with other optimizations) stackspace each call. A new stack
>>>>>defragmentation trick by ms? For 8-byte alignment those paddings seem a bit too
>>>>>huge. Each call eats one cacheline.
>>>>>Can someone please explain what's going on here ;-)
>>>>
>>>>Calling conventions. You should reserve (I believe) 32 bytes on stack for
>>>>function you are calling. Extra 8 bytes are because stack should be 16-byte
>>>>aligned, but on function entry it is 8-byte aligned, and we are saving even
>>>>number of registers.
>>>
>>>I see - usually we have some more variables on the stack - so the waste becomes
>>>relatively smaller if not zero.
>>>
>>>Otoh there are 3 register parameters as well as a lot of remaining registers.
>>>A recursive, very compact qsearch ...
>>
>>You compiled your function optimized for size (/O1), and because of that
>>compiler decided to use very short PUSH/POP instructions to save/restore
>>registers, even though it results in some unused slots on stack.
>>If you compile your program optimizing for speed (/O2 or /Ox), compiler will
>>use MOV instructions, and it will save registers into empty stack slots
>>provided by caller:
>>
>>; Listing generated by Microsoft (R) Optimizing Compiler Version 14.00.50317
>>...
>>
>>?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC ; DeBruijnGenerator::searchDeBruijn
>>    sub   rsp, 40                    ; 00000028H
>>    mov   QWORD PTR [rsp+48], rbx
>>    mov   rbx, rcx
>>; Line 62
>>    mov   ecx, DWORD PTR [rcx+32]
>>    cmp   ecx, 1
>>    mov   QWORD PTR [rsp+72], rdi
>>    jbe   $LN5@searchDeBr
>>    mov   QWORD PTR [rsp+56], rbp
>>    mov   QWORD PTR [rsp+64], rsi
>>    ...
>>
>>Is that what you want?
>
>
>Eugene,
>
>excuse my ignorance - but i still have some problems to understand the calling
>convention of msc2005 for x86-64. For what are those 32++ bytes necessary,
>"allocated" by the caller?

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/kmarch/hh/kmarch/64bitAMD_6ec00b51-bf75-41bf-8635-caa8653c8bd9.xml.asp

Thanks,
Eugene

>Even - as in the /O1 case the callee does not save any registers here, but
>pushes and pops them in a disjoint stack area?
>
>If saving registers, why not on the frame of the callee instead of the caller?
>
>?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC ; DeBruijnGenerator::searchDeBruijn
>    sub   rsp, 40
>    ; rsp+40 is return address
>    mov   QWORD PTR [rsp + 0], rbx
>    mov   rbx, rcx
>; Line 62
>    mov   ecx, DWORD PTR [rcx+32]
>    cmp   ecx, 1
>    mov   QWORD PTR [rsp+24], rdi
>    jbe   $LN5@searchDeBr
>    mov   QWORD PTR [rsp+8], rbp
>    mov   QWORD PTR [rsp+16], rsi
>    ...
>
>
>Thanks,
>Gerd
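For readers following the numbers in this thread, the stack arithmetic quoted above (32 bytes reserved for the callee, plus padding so the stack ends up 16-byte aligned) can be written out as a small C++ sketch. This is only an illustration, not compiler code: the helper `subRspBytes` and the constants are hypothetical names, and the /O1 case assumes that build pushes the same four registers (rbx, rbp, rsi, rdi) that the /O2 listing saves, which the thread does not show explicitly.

```cpp
#include <cstdint>

// Hypothetical constants for this sketch (not taken from any SDK header).
constexpr std::uint32_t kSlotBytes  = 8;               // one 64-bit stack slot
constexpr std::uint32_t kHomeBytes  = 4 * kSlotBytes;  // 32 bytes reserved for the callee
constexpr std::uint32_t kStackAlign = 16;              // required alignment at a call

// Bytes a non-leaf function would subtract from RSP in its prologue.
// pushedRegs: registers saved with PUSH before the SUB (assumed input).
// localBytes: bytes of local variables kept on the stack.
constexpr std::uint32_t subRspBytes(std::uint32_t pushedRegs,
                                    std::uint32_t localBytes) {
    // On entry the return address is already on the stack, so RSP is
    // 8 bytes off a 16-byte boundary; every PUSH moves it by another 8.
    const std::uint32_t alreadyUsed = kSlotBytes + pushedRegs * kSlotBytes;
    const std::uint32_t needed = localBytes + kHomeBytes;
    const std::uint32_t padding =
        (kStackAlign - (alreadyUsed + needed) % kStackAlign) % kStackAlign;
    return needed + padding;
}

// /O2 case from the listing: no PUSHes, no locals -> "sub rsp, 40".
static_assert(subRspBytes(0, 0) == 40, "32 bytes reserved + 8 bytes padding");

// Assumed /O1 case: four PUSHes (32 bytes) followed by the same 40-byte SUB,
// which adds up to the 72 bytes mentioned in the original question.
static_assert(4 * kSlotBytes + subRspBytes(4, 0) == 72, "32 pushed + 40");

// With an odd number of PUSHes the padding disappears.
static_assert(subRspBytes(3, 0) == 32, "no padding needed");
```

The last check is the reason an even number of saved registers costs the extra 8 bytes: the return address plus an even number of 8-byte slots always leaves the stack 8 bytes short of a 16-byte boundary.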
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.