Computer Chess Club Archives

Subject: Re: An Opteron note

Author: Gerd Isenberg

Date: 02:29:43 11/27/03


On November 26, 2003 at 21:18:05, Robert Hyatt wrote:

>I have been converting the X86.s file to work in 64 bit mode on the Opteron
>system I am playing with.  And I must say that after studying the Opteron
>64 bit instruction set, I'm impressed.
>
>First, all the old opcodes work..   mov, sub, bsf, etc..
>
>Second, the familiar 8 32-bit regs are still there.  But they can be
>named %rax rather than %eax to stretch them to 64 bits.  Cute.  And
>then there are 8 more registers you can use with the same old opcodes
>and addressing modes.
>
>In short, it's well-thought-out and very easy to use.  I'll post some
>performance later.  I have PopCnt(), FirstOne() and LastOne() working
>fine.  After I finish the others, I'll see how much (if any) it speeds
>things up.

Hi Bob,

Very curious about your 64-bit results...

I see a good chance for Matt Taylor's de Bruijn multiplication to become faster
than a single bsf on AMD64, because bsf reg64 is still a 9-cycle vector-path
instruction, while the 32*32=64-bit and 64*64=128-bit multiplies became
direct-path instructions on AMD64 (3/5 cycles). Maybe you can try it some day...
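
To make the idea concrete, here is a minimal sketch of a plain 64-bit de Bruijn bitscan in C. It is not Matt Taylor's routine (his version folds the bitboard so that a 32-bit multiply suffices); it only illustrates the isolate-LSB, multiply, shift, lookup pattern that would replace bsf. The names init_index64 and first_one_debruijn are made up for this example, and the constant is assumed to be one of the commonly used 64-bit de Bruijn sequences. The lookup table is derived from the constant at startup rather than typed in by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constant: a commonly used 64-bit de Bruijn sequence. */
static const uint64_t DEBRUIJN64 = UINT64_C(0x03f79d71b4cb0a89);

/* Maps the top 6 bits of (isolated_bit * DEBRUIJN64) to a bit index. */
static int index64[64];

/* Fill the table once: shifting the sequence left by i puts a unique
 * 6-bit pattern into the top bits, which identifies i. */
static void init_index64(void)
{
    for (int i = 0; i < 64; i++)
        index64[(DEBRUIJN64 << i) >> 58] = i;
}

/* Index of the least significant 1 bit (0..63); bb must be nonzero. */
static int first_one_debruijn(uint64_t bb)
{
    assert(bb != 0);
    uint64_t lsb = bb & (0 - bb);          /* isolate the lowest set bit   */
    return index64[(lsb * DEBRUIJN64) >> 58]; /* one multiply, one lookup  */
}
```

After calling init_index64() once, first_one_debruijn(bb) can be used wherever FirstOne-style code needs the lowest set bit. Whether it actually beats bsf depends on the multiply latency and on the table staying in cache, so it needs measuring on the target machine.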

The eight additional general-purpose registers r8-r15 (64-bit) or r8D-r15D
(32-bit) may even be used as address registers (with a REX prefix).

Note that "long" is still 32-bit with msc but 64-bit in gcc. Also be aware of
the sign-extension penalty when using signed 32-bit variables as array indices:

Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™

2.22 Using Unsigned Integers for 32-Bit Array Indices

Optimization
When using a 32-bit variable as an array index, declare the variable as an
unsigned integer instead of a signed integer.

Application
This optimization applies to 64-bit software.

Rationale
When performing 64-bit address arithmetic, the compiler must insert an
additional instruction into the object code to sign-extend a signed 32-bit
integer, which reduces performance; no additional instruction is necessary to
zero-extend an unsigned 32-bit integer because the processor performs
zero-extension automatically. (There is no performance penalty for using 64-bit
variables—either signed or unsigned—as array indices.)
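
As an illustration of the guide's point, here is a small C sketch with two otherwise identical table lookups. The function names are hypothetical, and the instructions mentioned in the comments are what a 64-bit compiler would typically emit, not a guarantee; check the generated assembly for your compiler.

```c
#include <stdint.h>

/* Signed 32-bit index: before it can be used in a 64-bit address,
 * the compiler typically has to sign-extend it (e.g. movsxd). */
static int lookup_signed(const int *table, int from, int delta)
{
    int idx = from + delta;
    return table[idx];
}

/* Unsigned 32-bit index: a 32-bit add already clears the upper half
 * of the 64-bit register, so no extra extension is usually needed. */
static int lookup_unsigned(const int *table, unsigned from, unsigned delta)
{
    unsigned idx = from + delta;
    return table[idx];
}

/* 64-bit index: no extension at all, signed or unsigned. */
static int lookup_wide(const int *table, int64_t from, int64_t delta)
{
    return table[from + delta];
}
```

All three return the same value for in-range, non-negative indices; only the address arithmetic differs.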

Cheers,
Gerd


Last modified: Thu, 15 Apr 21 08:11:13 -0700