Author: Eugene Nalimov
Date: 09:29:42 03/28/04
On March 28, 2004 at 11:58:46, Robert Hyatt wrote:

>On March 27, 2004 at 20:12:27, Slater Wold wrote:
>
>>On March 27, 2004 at 17:11:31, Robert Hyatt wrote:
>>
>>>On March 26, 2004 at 16:09:31, Slater Wold wrote:
>>>
>>>>On March 26, 2004 at 12:09:26, Russell Reagan wrote:
>>>>
>>>>>On March 26, 2004 at 09:15:12, Fabien Letouzey wrote:
>>>>>
>>>>>>Would there be a reason why, for instance, 16-bit integers would be slower in
>>>>>>64-bit mode?
>>>>>
>>>>>Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™
>>>>>
>>>>>2.22 Using Unsigned Integers for 32-Bit Array Indices
>>>>>
>>>>>Optimization
>>>>>When using a 32-bit variable as an array index, declare the variable as an
>>>>>unsigned integer instead of a signed integer.
>>>>>
>>>>>Application
>>>>>This optimization applies to 64-bit software.
>>>>>
>>>>>Rationale
>>>>>When performing 64-bit address arithmetic, the compiler must insert an
>>>>>additional instruction into the object code to sign-extend a signed 32-bit
>>>>>integer, which reduces performance; no additional instruction is necessary to
>>>>>zero-extend an unsigned 32-bit integer because the processor performs
>>>>>zero-extension automatically. (There is no performance penalty for using 64-bit
>>>>>variables—either signed or unsigned—as array indices.)
>>>>
>>>>This is not factual, at all. And it was removed from the Guide.
>>>
>>>What is not factual. Signed 32 bit indices definitely cause a problem. you can
>>>look at the asm output to see why...
>>
>>Because it does not apply 100% of the time. There are times when using a signed
>>int will be faster.
>>
>>Generally speaking, using an unsigned int is faster. But not always.
>
>I am talking about one specific case: Using a signed anything (less than 64
>bits) as an array index. It will never be faster there, because anything less
>than 64 bits has to be extended to 64 bits as all addresses are 64 bits on that
>processor in 64 bit mode...
>
>I can't imagine any case where unsigned is slower, period, however... The same
>instruction is used...

There are situations where VC can hoist the signed lengthening conversion out of
the loop but cannot hoist the unsigned one. The reason is C/C++ semantics:
according to the Standard, signed overflow results in undefined behavior, so the
compiler can assume that no overflow happens. Unsigned arithmetic, by contrast,
is defined to wrap around modulo 2^32, and the compiler has to preserve that.

    void foo (unsigned *p, unsigned s, unsigned f, unsigned k, unsigned step)
    {
        unsigned i;

        for (i=s; i<f; i+=step) {
            p[i+k]=0;
            k+=k;
        }
    }

    void bar (int *p, int s, int f, int k, int step)
    {
        int i;

        for (i=s; i<f; i+=step) {
            p[i+k]=0;
            k+=k;
        }
    }

Code generated for the loops:

    $LL3@foo:
    ; Line 6
            lea     rax, QWORD PTR [r10+r11]
            add     edx, r9d
    ; Line 7
            add     r11, r11
            add     r10, r9
            cmp     edx, r8d
            mov     DWORD PTR [rcx+rax*4], edi
            jb      SHORT $LL3@foo

    $LL3@bar:
    ; Line 16
            lea     rax, QWORD PTR [r10+rdx]
            add     r10, r11
    ; Line 17
            add     rdx, rdx
            cmp     r10, r9
            mov     DWORD PTR [rcx+rax*4], r8d
            jl      SHORT $LL3@bar

As you can see, the code for the "int" loop is better: six instructions per
iteration instead of seven.

Thanks,
Eugene
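To see why the unsigned conversion is harder to hoist, here is a minimal sketch in C. It is not part of the original post and does not model what VC actually emits. The helper names index_wrapping, index_widened and self_check are made up for illustration. It compares the index an unsigned loop has to compute (a 32-bit sum that wraps modulo 2^32 and is then zero-extended) with the index you would get if i and k were widened first and added in 64 bits. Whenever the two differ, widening ahead of the loop would change which element is written.

```c
#include <assert.h>
#include <stdint.h>

/* Index as the unsigned loop must compute it: the 32-bit sum i+k wraps
   modulo 2^32 and only then is zero-extended to 64 bits. */
static uint64_t index_wrapping(uint32_t i, uint32_t k)
{
    return (uint64_t)(uint32_t)(i + k);
}

/* Index obtained if i and k were widened first and added in 64 bits. */
static uint64_t index_widened(uint32_t i, uint32_t k)
{
    return (uint64_t)i + (uint64_t)k;
}

/* Hypothetical self-check: aborts if the two computations do not behave
   as described in the comments below. */
static void self_check(void)
{
    /* No wraparound: both computations agree. */
    assert(index_wrapping(3u, 4u) == 7u);
    assert(index_widened(3u, 4u) == 7u);

    /* 0xFFFFFFFF + 2 wraps to 1 in 32 bits but is 0x100000001 in 64 bits. */
    assert(index_wrapping(0xFFFFFFFFu, 2u) == 1u);
    assert(index_widened(0xFFFFFFFFu, 2u) == 0x100000001ull);
}
```

Calling self_check() from a main function should pass silently. For int there is no such required wraparound: overflow is undefined, so the compiler is allowed to treat the widened form as equivalent, and that is the freedom the "int" loop benefits from.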
Last modified: Thu, 15 Apr 21 08:11:13 -0700