Author: Eugene Nalimov
Date: 09:29:42 03/28/04
On March 28, 2004 at 11:58:46, Robert Hyatt wrote:
>On March 27, 2004 at 20:12:27, Slater Wold wrote:
>
>>On March 27, 2004 at 17:11:31, Robert Hyatt wrote:
>>
>>>On March 26, 2004 at 16:09:31, Slater Wold wrote:
>>>
>>>>On March 26, 2004 at 12:09:26, Russell Reagan wrote:
>>>>
>>>>>On March 26, 2004 at 09:15:12, Fabien Letouzey wrote:
>>>>>
>>>>>>Would there be a reason why, for instance, 16-bit integers would be slower in
>>>>>>64-bit mode?
>>>>>
>>>>>Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™
>>>>>
>>>>>2.22 Using Unsigned Integers for 32-Bit Array Indices
>>>>>
>>>>>Optimization
>>>>>When using a 32-bit variable as an array index, declare the variable as an
>>>>>unsigned integer instead of a signed integer.
>>>>>
>>>>>Application
>>>>>This optimization applies to 64-bit software.
>>>>>
>>>>>Rationale
>>>>>When performing 64-bit address arithmetic, the compiler must insert an
>>>>>additional instruction into the object code to sign-extend a signed 32-bit
>>>>>integer, which reduces performance; no additional instruction is necessary to
>>>>>zero-extend an unsigned 32-bit integer because the processor performs
>>>>>zero-extension automatically. (There is no performance penalty for using 64-bit
>>>>>variables—either signed or unsigned—as array indices.)
>>>>
>>>>This is not factual, at all. And it was removed from the Guide.
>>>
>>>What is not factual? Signed 32-bit indices definitely cause a problem. You can
>>>look at the asm output to see why...
>>
>>Because it does not apply 100% of the time. There are times when using a signed
>>int will be faster.
>>
>>Generally speaking, using an unsigned int is faster. But not always.
>
>I am talking about one specific case: Using a signed anything (less than 64
>bits) as an array index. It will never be faster there, because anything less
>than 64 bits has to be extended to 64 bits as all addresses are 64 bits on that
>processor in 64 bit mode...
>
>I can't imagine any case where unsigned is slower, period, however... The same
>instruction is used...

There are situations where VC can hoist a signed lengthening conversion out of
the loop but cannot hoist the unsigned one. The reason is C/C++ semantics:
according to the Standard, signed overflow results in undefined behavior, so the
compiler can assume that there is no overflow. Unsigned arithmetic is defined to
wrap around, so the compiler has to preserve that behavior.

void foo (unsigned *p, unsigned s, unsigned f, unsigned k, unsigned step)
{
    unsigned i;
    for (i=s; i<f; i+=step) {
        p[i+k]=0;
        k+=k;
    }
}

void bar (int *p, int s, int f, int k, int step)
{
    int i;
    for (i=s; i<f; i+=step) {
        p[i+k]=0;
        k+=k;
    }
}

Code generated for the loops:

$LL3@foo:
; Line 6
        lea     rax, QWORD PTR [r10+r11]
        add     edx, r9d
; Line 7
        add     r11, r11
        add     r10, r9
        cmp     edx, r8d
        mov     DWORD PTR [rcx+rax*4], edi
        jb      SHORT $LL3@foo

$LL3@bar:
; Line 16
        lea     rax, QWORD PTR [r10+rdx]
        add     r10, r11
; Line 17
        add     rdx, rdx
        cmp     r10, r9
        mov     DWORD PTR [rcx+rax*4], r8d
        jl      SHORT $LL3@bar

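As you can see, the code for the "int" loop is better: it is one instruction
shorter, because the unsigned loop also has to keep a separate 32-bit counter
(edx) for the comparison.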
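
To make the wraparound point concrete, here is a minimal sketch of why the unsigned conversion cannot simply be moved. The function names are hypothetical (they are not from the compiler or the Guide), and the sketch assumes that unsigned int is 32 bits wide, as it is with VC on x64. The first function computes the index the way the C source asks for it: add in 32 bits, then widen. The second computes it the way a hoisted conversion would: widen first, then add in 64 bits.

```c
#include <stdint.h>
#include <stdio.h>

/* Index as the source specifies it: the 32-bit add wraps modulo 2^32,
   and only the wrapped result is zero-extended to 64 bits. */
uint64_t index_wrap_then_widen(uint32_t i, uint32_t k)
{
    return (uint64_t)(uint32_t)(i + k);
}

/* Index as a hoisted conversion would compute it: both operands are
   widened first, so the 64-bit add keeps the carry out of bit 31. */
uint64_t index_widen_then_add(uint32_t i, uint32_t k)
{
    return (uint64_t)i + (uint64_t)k;
}

/* Prints both indices for one pair of inputs so they can be compared. */
void print_both(uint32_t i, uint32_t k)
{
    printf("wrap then widen: 0x%llx, widen then add: 0x%llx\n",
           (unsigned long long)index_wrap_then_widen(i, k),
           (unsigned long long)index_widen_then_add(i, k));
}
```

For small values the two agree, but for i = 0xFFFFFFF0 and k = 0x20 the first gives 0x10 and the second gives 0x100000010, so the two forms are not interchangeable for unsigned operands. With signed int the overflowing case is undefined, which is what lets the compiler pick the cheaper form.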
Thanks,
Eugene
Last modified: Thu, 15 Apr 21 08:11:13 -0700