Computer Chess Club Archives


Subject: Re: odd msc 2005 behaviour - 64-bit mode

Author: Gerd Isenberg

Date: 01:53:16 06/16/05


<snip>
>>Eugene,
>>
>>excuse my ignorance - but I still have some problems understanding the calling
>>convention of msc2005 for x86-64. For what are those 32++ bytes necessary,
>>"allocated" by the caller?
>
>http://msdn.microsoft.com/library/default.asp?url=/library/en-us/kmarch/hh/kmarch/64bitAMD_6ec00b51-bf75-41bf-8635-caa8653c8bd9.xml.asp
>
>Thanks,
>Eugene
>
>>Even - as in the /O1 case the callee does not save any registers here, but
>>pushes and pops them in a disjoint stack area?
>>
>>If saving registers, why not on the frame of the callee instead of the caller?
>>
>>?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC	; DeBruijnGenerator::searchDeBruijn
>>	sub	rsp, 40
>>        ; rsp+40 is return address
>>	mov	QWORD PTR [rsp + 0], rbx
>>	mov	rbx, rcx
>>; Line 62
>>	mov	ecx, DWORD PTR [rcx+32]
>>	cmp	ecx, 1
>>	mov	QWORD PTR [rsp+24], rdi
>>	jbe	$LN5@searchDeBr
>>	mov	QWORD PTR [rsp+8], rbp
>>	mov	QWORD PTR [rsp+16], rsi
>>	...
>>

OK, I learned something about volatile and nonvolatile registers, frame and leaf
functions, and stack allocation (including dynamic allocation with alloca).

Stack Allocation:
------------------------------------------------------------------------
...

The parameter area is always at the bottom of the stack (even if the alloca
function is used), so that the area is always adjacent to the return address
during any function call. The area contains at least four entries but always
enough space to hold all the parameters required by any function that might be
called. Note that space is always allocated for the register parameters, even if
the parameters themselves are never homed to the stack; a callee is guaranteed
that space has been allocated for all its parameters. Home addresses are
required for the register arguments so a contiguous area is available in case
the called function must take the address of the argument list (va_list) or an
individual argument. This area also provides a convenient place to save register
arguments during thunk execution and as a debugging option (for example, the
arguments are easy to find during debugging if they are stored at their home
addresses in the prolog code).
------------------------------------------------------------------------
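
The sizing rule quoted above can be sketched as a little arithmetic. This is only
an illustration under stated assumptions: 8-byte slots, a 16-byte aligned stack
at each call, and a non-leaf function that pushes no registers. The helper names
parameterAreaBytes and frameBytes are made up for this post and are not part of
any compiler or API.

```cpp
#include <algorithm>
#include <cstddef>

// Assumption: each parameter slot is 8 bytes wide, regardless of the
// parameter's own size, and at least four slots are always reserved.
constexpr std::size_t kSlotBytes = 8;
constexpr std::size_t kMinSlots  = 4;

// Bytes a caller reserves for outgoing parameters, given the largest
// parameter count of any function it calls (hypothetical helper).
constexpr std::size_t parameterAreaBytes(std::size_t maxCalleeParams) {
    return std::max(kMinSlots, maxCalleeParams) * kSlotBytes;
}

// Smallest N for "sub rsp, N" in a non-leaf function that pushes nothing.
// On entry rsp is 8 mod 16 (the return address was just pushed), so N must
// be 8 mod 16 to restore 16-byte alignment before the next call.
constexpr std::size_t frameBytes(std::size_t maxCalleeParams,
                                 std::size_t localBytes) {
    const std::size_t need = parameterAreaBytes(maxCalleeParams) + localBytes;
    return ((need + 7) / 16) * 16 + 8;
}

// Four reserved slots plus 8 bytes of alignment padding: the "sub rsp, 40"
// seen in the listing above.
static_assert(frameBytes(1, 0) == 40, "32 bytes of slots + 8 bytes padding");
```

So even a function that only ever calls one-parameter callees gives up 40 bytes
of stack per activation, before any locals of its own.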

I fear the "overhead" introduced by the msc2005 x86-64 calling convention makes
it a bit harder to beat 32-bit compiles in cases where deep recursion occurs,
especially with a lot of 32-bit parameters.

The "big waste" of moving zero- or sign-extended high dwords around usually
doesn't occur on x86-64, because 32-bit ops like mov DWORD PTR are still the
default. It does happen on the stack with scalar 32-bit ints, though, where each
parameter occupies a full 8-byte slot. That means higher memory bandwidth and
cache-line usage (also due to unused stack slots).

So keeping alpha, beta, depth, ply etc. inside a struct or class and using
explicit stacks of those structs might be worth considering ...
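
A minimal sketch of that idea, with made-up names (SearchFrame, visit,
countNodes) and a toy uniform tree in place of a real move generator. The four
32-bit values are packed into one 16-byte struct per ply, and the recursive
function receives a single pointer instead of four separate parameters. Whether
this actually beats the plain parameter-passing version is untested here and
would need measuring.

```cpp
#include <array>
#include <cstdint>

// Hypothetical per-ply state: four 32-bit ints packed into 16 bytes,
// instead of four separate 8-byte parameter slots.
struct SearchFrame {
    std::int32_t alpha;
    std::int32_t beta;
    std::int32_t depth;
    std::int32_t ply;
};

constexpr int kMaxPly    = 64;  // assumed upper bound on search depth
constexpr int kBranching = 3;   // toy stand-in for "number of moves"

// Toy recursive search over a uniform tree. It only counts visited nodes;
// a real search would generate moves and evaluate positions here.
std::uint64_t visit(SearchFrame* frame) {
    if (frame->depth == 0) return 1;
    std::uint64_t nodes = 1;
    SearchFrame* child = frame + 1;  // next entry of the explicit stack
    for (int move = 0; move < kBranching; ++move) {
        child->alpha = -frame->beta;   // negamax-style window swap
        child->beta  = -frame->alpha;
        child->depth = frame->depth - 1;
        child->ply   = frame->ply + 1;
        nodes += visit(child);
    }
    return nodes;
}

// Sets up the explicit stack and starts at the root.
// Assumes 0 <= depth < kMaxPly.
std::uint64_t countNodes(int depth) {
    std::array<SearchFrame, kMaxPly> frames{};
    frames[0] = SearchFrame{-32000, 32000, depth, 0};
    return visit(frames.data());
}
```

For example, countNodes(2) walks 1 + 3 + 9 = 13 nodes. The call itself still
costs a return address and the reserved parameter area, but the search state no
longer travels through 8-byte slots at every level.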

Thanks,
Gerd




Last modified: Thu, 15 Apr 21 08:11:13 -0700
