Computer Chess Club Archives


Subject: Re: odd msc 2005 behaviour - 64-bit mode

Author: Eugene Nalimov

Date: 20:38:27 06/13/05


On June 13, 2005 at 14:14:09, Gerd Isenberg wrote:

>On June 13, 2005 at 13:45:16, Eugene Nalimov wrote:
>
>>On June 13, 2005 at 13:23:53, Gerd Isenberg wrote:
>>
>>>hi, compiler experts!
>>>
>>>Inside a recursive search routine (not alpha/beta but my fruit fly ;-) with only
>>>the this-pointer and one additional integer parameter and local, msc2005 wastes 40
>>>bytes (72 with other optimizations) of stack space on each call. A new stack
>>>defragmentation trick by MS? For 8-byte alignment those paddings seem a bit too
>>>huge. Each call eats one cache line.
>>>Can someone please explain what's going on here ;-)
>>
>>Calling conventions. You should reserve (I believe) 32 bytes on the stack for
>>the function you are calling. The extra 8 bytes are there because the stack
>>should be 16-byte aligned, but on function entry it is only 8-byte aligned, and
>>we are saving an even number of registers.
>
>I see - usually we have some more variables on the stack - so the waste becomes
>relatively smaller, if not zero.
>
>Otoh there are 3 register parameters as well as a lot of remaining registers.
>A recursive, very compact qsearch ...

You compiled your function optimized for size (/O1), and because of that the
compiler decided to use very short PUSH/POP instructions to save/restore
registers, even though that leaves some unused slots on the stack. If you
compile your program optimizing for speed (/O2 or /Ox), the compiler will use
MOV instructions, and it will save registers into the empty stack slots
provided by the caller:

; Listing generated by Microsoft (R) Optimizing Compiler Version 14.00.50317
...

?searchDeBruijn@DeBruijnGenerator@@QEAAXI@Z PROC	; DeBruijnGenerator::searchDeBruijn
	sub	rsp, 40					; 00000028H
	mov	QWORD PTR [rsp+48], rbx
	mov	rbx, rcx
; Line 62
	mov	ecx, DWORD PTR [rcx+32]
	cmp	ecx, 1
	mov	QWORD PTR [rsp+72], rdi
	jbe	$LN5@searchDeBr
	mov	QWORD PTR [rsp+56], rbp
	mov	QWORD PTR [rsp+64], rsi
	...

Is that what you want?

(You can also see that the compiler "shrink wrapped" the save/restore for some
registers, i.e. RBP and RSI are saved/restored only on the path where they are
actually used in the function).
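
The offsets in the listing can be checked with a little arithmetic. The sketch below is only an illustration, assuming the Windows x64 convention described above (8-byte slots, 32 bytes of home space reserved by every caller, RSP 16-byte aligned before each CALL). The names frame_size and caller_home_slot are made up for this example and are not part of any compiler or API.

```cpp
#include <cassert>

// Assumptions (Windows x64 convention as described in this thread):
//  - every stack slot is 8 bytes
//  - a caller reserves 4 slots (32 bytes) of home space for its callee
//  - RSP is 16-byte aligned before CALL, so it is 8 bytes off on entry
constexpr int kSlot = 8;
constexpr int kHomeSpace = 4 * kSlot;

// Hypothetical helper: smallest "sub rsp, N" that reserves home space for
// callees plus local_bytes of locals and leaves RSP 16-byte aligned again.
constexpr int frame_size(int local_bytes) {
    return (kHomeSpace + local_bytes + kSlot + 15) / 16 * 16 - kSlot;
}

// Hypothetical helper: offset from RSP (after "sub rsp, frame") of the
// index-th home slot that our own caller reserved for us. The return
// address sits at [rsp + frame], the home slots start right above it.
constexpr int caller_home_slot(int frame, int index) {
    return frame + kSlot + index * kSlot;
}

// 32 bytes of home space + 8 bytes of alignment padding = 40, as in the listing.
static_assert(frame_size(0) == 40, "matches 'sub rsp, 40'");

// Compare against the offsets the /O2 listing uses for the saved registers.
inline void check_against_listing() {
    const int frame = frame_size(0);
    assert(caller_home_slot(frame, 0) == 48);  // mov [rsp+48], rbx
    assert(caller_home_slot(frame, 1) == 56);  // mov [rsp+56], rbp
    assert(caller_home_slot(frame, 2) == 64);  // mov [rsp+64], rsi
    assert(caller_home_slot(frame, 3) == 72);  // mov [rsp+72], rdi
}
```

So the four MOVs land exactly in the four slots the caller already reserved, and the function's own 40 bytes are the home space for its callees plus the alignment padding.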

>Well, maybe an iterative approach for alpha/beta pays off even more.
>
>>
>>>I also wonder whether it is not possible for the compiler to keep the class
>>>members inside registers during the recursive search - dumb compiler ;-)
>>
>>We were thinking about such an optimization, but had to prune it due to some more
>>urgent needs. In any case you have an indirect call in your function, so the
>>optimization would not fire even were it implemented.
>
>The virtual const might be a hint.

We are not using types for memory disambiguation (alias analysis). That is a
conscious decision. We know that by using type information we can improve the
quality of generated code. Unfortunately, by doing so we would also break a lot
of existing code. Yes, that code is not standard compliant, but it has always
compiled and worked, it can be 20 years old, such bugs are very hard to trace,
and we don't want our customers to complain "typical MS product -- buggy
compiler broke my code".
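
For readers who have not met the issue, here is a minimal sketch of the kind of code in question. It is a made-up example, not taken from any customer code: store_both and float_bits are hypothetical names, and the bit patterns mentioned below assume 32-bit IEEE 754 floats.

```cpp
#include <cstdint>
#include <cstring>

// Legacy-style pattern (hypothetical). Old code sometimes calls this with
// both pointers aimed at the same memory, e.g. store_both(&x, (float*)&x),
// and expects the final read to see the bits written through f. That call
// is not standard compliant. A compiler doing type-based alias analysis may
// assume an int* and a float* never overlap and simply return 1 without
// re-reading *i. A compiler that ignores types for disambiguation has to
// re-read, so the old code keeps working.
inline int store_both(int* i, float* f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}

// Standard-compliant way to look at the bits of a float: copy the bytes.
// This is correct whether or not the compiler uses type information.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

With IEEE 754 floats, float_bits(1.0f) gives 0x3F800000. The first function is the sort of thing that "always compiled and worked" and would silently change behaviour if type-based disambiguation were switched on.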

Thanks,
Eugene

>Thanks,
>Gerd
>
>>
>>Thanks,
>>Eugene




Last modified: Thu, 15 Apr 21 08:11:13 -0700
