Author: Gerd Isenberg
Date: 01:53:16 06/16/05
<snip>
>>Eugene,
>>
>>excuse my ignorance - but I still have some problems understanding the calling
>>convention of msc2005 for x86-64. What are those 32++ bytes, "allocated" by
>>the caller, necessary for?
>
>http://msdn.microsoft.com/library/default.asp?url=/library/en-us/kmarch/hh/kmarch/64bitAMD_6ec00b51-bf75-41bf-8635-caa8653c8bd9.xml.asp
>
>Thanks,
>Eugene
>
>>Even - as in the /O1 case the callee does not save any registers here, but
>>pushes and pops them in a disjoint stack area?
>>
>>If saving registers, why not on the frame of the callee instead of the caller?
>>
>>?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC ; DeBruijnGenerator::searchDeBruijn
>> sub rsp, 40
>> ; rsp+40 is return address
>> mov QWORD PTR [rsp + 0], rbx
>> mov rbx, rcx
>>; Line 62
>> mov ecx, DWORD PTR [rcx+32]
>> cmp ecx, 1
>> mov QWORD PTR [rsp+24], rdi
>> jbe $LN5@searchDeBr
>> mov QWORD PTR [rsp+8], rbp
>> mov QWORD PTR [rsp+16], rsi
>> ...
>>

Ok, I learned something about volatile and nonvolatile registers, frame and leaf functions and (dynamic, with alloca) stack allocation.

Stack Allocation:
------------------------------------------------------------------------
...
The parameter area is always at the bottom of the stack (even if the alloca function is used), so that the area is always adjacent to the return address during any function call. The area contains at least four entries but always enough space to hold all the parameters required by any function that might be called. Note that space is always allocated for the register parameters, even if the parameters themselves are never homed to the stack; a callee is guaranteed that space has been allocated for all its parameters. Home addresses are required for the register arguments so a contiguous area is available in case the called function must take the address of the argument list (va_list) or an individual argument.
This area also provides a convenient place to save register arguments during thunk execution and as a debugging option (for example, the arguments are easy to find during debugging if they are stored at their home addresses in the prolog code).
------------------------------------------------------------------------

I fear the "overhead" introduced by the msc2005 x86-64 calling convention makes it a bit harder to beat 32-bit compiles in cases where deep recursion occurs, especially with a lot of 32-bit parameters. The "big waste" of moving zero- or sign-extended high dwords around usually doesn't occur on x86-64, because 32-bit ops like mov DWORD PTR are still the default. But it does happen on the stack with scalar 32-bit ints, which costs extra memory bandwidth and cache lines (also due to unused stack slots).

So keeping alpha, beta, depth, ply etc. inside a struct or class and using explicit stacks of those structs might be worth considering ...

Thanks,
Gerd
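
To make the struct idea a bit more concrete, here is a minimal C++ sketch, not taken from any real engine. Everything in it is an assumption made for illustration: the names SearchFrame, ToySearcher, searchParams and searchFrames are hypothetical, and the "game" is just a binary tree with a made-up leaf score. It only shows the shape of the change: four 32-bit values packed into a 16-byte frame in an explicit per-ply array, instead of four int arguments that each get their own 8-byte slot under the x64 convention. Whether this is actually faster would have to be measured.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical per-ply search state: four 32-bit fields pack into 16 bytes.
struct SearchFrame {
    std::int32_t alpha, beta, depth, ply;
};
static_assert(sizeof(SearchFrame) == 16, "four 32-bit fields should pack into 16 bytes");

// Toy fail-hard alpha-beta over a synthetic binary tree (assumption: no real
// move generation; node_ stands in for the board and make/unmake for moves).
class ToySearcher {
public:
    static constexpr int kMaxPly = 32;  // depth must stay below this
    static constexpr int kInf = 1000;

    // Conventional recursion: alpha, beta, depth, ply passed as int arguments.
    int searchParams(int alpha, int beta, int depth, int ply) {
        if (depth == 0) return evaluate();
        for (int move = 0; move < 2; ++move) {
            make(move);
            int score = -searchParams(-beta, -alpha, depth - 1, ply + 1);
            unmake();
            if (score >= beta) return beta;   // fail-hard cutoff
            alpha = std::max(alpha, score);
        }
        return alpha;
    }

    // Same search, but the state lives in frames_[ply]; only ply is passed.
    int searchFrames(int ply) {
        SearchFrame& f = frames_[ply];
        if (f.depth == 0) return evaluate();
        for (int move = 0; move < 2; ++move) {
            // Set up the child's frame from the current (possibly raised) alpha.
            frames_[ply + 1] = {-f.beta, -f.alpha, f.depth - 1, ply + 1};
            make(move);
            int score = -searchFrames(ply + 1);
            unmake();
            if (score >= f.beta) return f.beta;
            f.alpha = std::max(f.alpha, score);
        }
        return f.alpha;
    }

    int searchRoot(int depth) {
        frames_[0] = {-kInf, kInf, depth, 0};
        return searchFrames(0);
    }

private:
    // Made-up deterministic leaf score in [-50, 50], from the side to move.
    int evaluate() const {
        return static_cast<int>(((node_ * 2654435761u) >> 16) % 101u) - 50;
    }
    void make(int move) { node_ = node_ * 2 + 1 + static_cast<std::uint32_t>(move); }
    void unmake() { node_ = (node_ - 1) / 2; }

    std::uint32_t node_ = 0;                      // current node in the toy tree
    std::array<SearchFrame, kMaxPly> frames_{};   // explicit stack of frames
};
```

Both variants walk the same tree with the same window, so searchRoot(d) and searchParams(-kInf, kInf, d, 0) should return the same score; the only difference is where alpha, beta, depth and ply live during the recursion.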
Last modified: Thu, 15 Apr 21 08:11:13 -0700