Computer Chess Club Archives


Subject: Re: Fruit 1.0 64-bit

Author: Eugene Nalimov

Date: 09:29:42 03/28/04


On March 28, 2004 at 11:58:46, Robert Hyatt wrote:

>On March 27, 2004 at 20:12:27, Slater Wold wrote:
>
>>On March 27, 2004 at 17:11:31, Robert Hyatt wrote:
>>
>>>On March 26, 2004 at 16:09:31, Slater Wold wrote:
>>>
>>>>On March 26, 2004 at 12:09:26, Russell Reagan wrote:
>>>>
>>>>>On March 26, 2004 at 09:15:12, Fabien Letouzey wrote:
>>>>>
>>>>>>Would there be a reason why, for instance, 16-bit integers would be slower in
>>>>>>64-bit mode?
>>>>>
>>>>>Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™
>>>>>
>>>>>2.22 Using Unsigned Integers for 32-Bit Array Indices
>>>>>
>>>>>Optimization
>>>>>When using a 32-bit variable as an array index, declare the variable as an
>>>>>unsigned integer instead of a signed integer.
>>>>>
>>>>>Application
>>>>>This optimization applies to 64-bit software.
>>>>>
>>>>>Rationale
>>>>>When performing 64-bit address arithmetic, the compiler must insert an
>>>>>additional instruction into the object code to sign-extend a signed 32-bit
>>>>>integer, which reduces performance; no additional instruction is necessary to
>>>>>zero-extend an unsigned 32-bit integer because the processor performs
>>>>>zero-extension automatically. (There is no performance penalty for using 64-bit
>>>>>variables—either signed or unsigned—as array indices.)
>>>>
>>>>This is not factual, at all.  And it was removed from the Guide.
>>>
>>>What is not factual?  Signed 32-bit indices definitely cause a problem.  You can
>>>look at the asm output to see why...
>>
>>Because it does not apply 100% of the time.  There are times when using a signed
>>int will be faster.
>>
>>Generally speaking, using an unsigned int is faster.  But not always.
>
>I am talking about one specific case:  Using a signed anything (less than 64
>bits) as an array index.  It will never be faster there, because anything less
>than 64 bits has to be extended to 64 bits as all addresses are 64 bits on that
>processor in 64 bit mode...
>
>I can't imagine any case where unsigned is slower, period, however...  The same
>instruction is used...

There are situations where VC can hoist the signed widening conversion out of the
loop but cannot hoist the unsigned one. The reason is C/C++ semantics: according
to the Standard, signed overflow is undefined behavior, so the compiler can
assume there is no overflow. Unsigned arithmetic is defined to wrap, so the
compiler has to preserve that behavior.

void foo (unsigned *p, unsigned s, unsigned f, unsigned k, unsigned step)
{
    unsigned i;

    for (i=s; i<f; i+=step) {
        p[i+k]=0;
        k+=k;
    }
}

void bar (int *p, int s, int f, int k, int step)
{
    int i;

    for (i=s; i<f; i+=step) {
        p[i+k]=0;
        k+=k;
    }
}

Code generated for the loops:

$LL3@foo:
; Line 6
	lea	rax, QWORD PTR [r10+r11]
	add	edx, r9d
; Line 7
	add	r11, r11
	add	r10, r9
	cmp	edx, r8d
	mov	DWORD PTR [rcx+rax*4], edi
	jb	SHORT $LL3@foo

$LL3@bar:
; Line 16
	lea	rax, QWORD PTR [r10+rdx]
	add	r10, r11
; Line 17
	add	rdx, rdx
	cmp	r10, r9
	mov	DWORD PTR [rcx+rax*4], r8d
	jl	SHORT $LL3@bar

As you can see, the code for the "int" loop is better: it has one fewer
instruction per iteration.
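
To make the wraparound point concrete, here is a minimal sketch (not part of the compiler output above; the function names are made up for illustration). It compares adding two unsigned 32-bit values and then widening the result, which is what the unsigned index expression asks for, against widening each operand first and adding in 64 bits, which is what a hoisted conversion would amount to. It assumes a platform where unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Index as the unsigned source expression defines it: add in 32 bits
   (wrapping modulo 2^32), then zero-extend the result to 64 bits. */
static uint64_t index_wrap_then_widen(uint32_t i, uint32_t k)
{
    return (uint64_t)(uint32_t)(i + k);
}

/* Index as a hoisted conversion would compute it: widen each operand
   first, then add in 64 bits, so no 32-bit wraparound can occur. */
static uint64_t index_widen_then_add(uint32_t i, uint32_t k)
{
    return (uint64_t)i + (uint64_t)k;
}

/* Hypothetical check, for illustration only. */
static void demo_wraparound(void)
{
    /* Small values: both orders give the same index. */
    assert(index_wrap_then_widen(10u, 20u) == index_widen_then_add(10u, 20u));

    /* Near the 32-bit limit they differ: the 32-bit sum wraps to 0,
       while the 64-bit sum is 2^32. */
    assert(index_wrap_then_widen(0xFFFFFFFFu, 1u) == 0u);
    assert(index_widen_then_add(0xFFFFFFFFu, 1u) == 0x100000000ull);
}
```

For the signed version the two orders can only disagree when i+k overflows, and since that case is undefined the compiler does not have to care about it.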

Thanks,
Eugene



Last modified: Thu, 15 Apr 21 08:11:13 -0700
