Author: Gerd Isenberg
Date: 02:29:43 11/27/03
On November 26, 2003 at 21:18:05, Robert Hyatt wrote:

>I have been converting the X86.s file to work in 64 bit mode on the Opteron
>system I am playing with. And I must say that after studying the Opteron
>64 bit instruction set, I'm impressed.
>
>First, all the old opcodes work.. mov, sub, bsf, etc..
>
>Second, the familiar 8 32-bit regs are still there. But they can be
>named %rax rather than %eax to stretch them to 64 bits. Cute. And
>then there are 8 more registers you can use with the same old opcodes
>and addressing modes.
>
>In short, it's well-thought-out and very easy to use. I'll post some
>performance later. I have PopCnt(), FirstOne() and LastOne() working
>fine. After I finish the others, I'll see how much (if any) it speeds
>things up.

Hi Bob,

Very curious about your 64-bit results...

I see good chances for Matt Taylor's de Bruijn multiplication to become faster than a single bsf on AMD64, because bsf reg64 is still a 9-cycle vector path instruction, while the 32*32=64-bit and 64*64=128-bit multiplies became direct path instructions on AMD64 (3/5 cycles). Maybe you can try it some day...

The eight additional general-purpose registers, r8-r15 (64-bit) or r8d-r15d (32-bit), can also be used as address registers (with a REX prefix).

Be aware of the "sign extension" penalty when using signed 32-bit variables as array indices. Note that "long" is still 32-bit with msc but 64-bit in gcc:

Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™

2.22 Using Unsigned Integers for 32-Bit Array Indices

Optimization
When using a 32-bit variable as an array index, declare the variable as an unsigned integer instead of a signed integer.

Application
This optimization applies to 64-bit software.

Rationale
When performing 64-bit address arithmetic, the compiler must insert an additional instruction into the object code to sign-extend a signed 32-bit integer, which reduces performance; no additional instruction is necessary to zero-extend an unsigned 32-bit integer because the processor performs zero-extension automatically. (There is no performance penalty for using 64-bit variables, either signed or unsigned, as array indices.)

Cheers,
Gerd
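
To make the multiplication idea concrete, here is a minimal sketch in C of a generic 64-bit de Bruijn bit scan. It is not Matt Taylor's exact routine, which folds the bitboard to 32 bits first. The constant 0x03f79d71b4cb0a89 is a commonly published de Bruijn sequence, and the names bitscan_forward, index64 and init_index64 as well as the lazy table setup are assumptions made for illustration. The table index is kept unsigned, in line with the guide's advice quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a commonly published 64-bit de Bruijn sequence.
   Every 6-bit window of it is distinct, which is what the lookup relies on. */
#define DEBRUIJN64 UINT64_C(0x03f79d71b4cb0a89)

static unsigned char index64[64];   /* maps 6-bit window -> bit index */
static int index64_ready = 0;

/* Build the lookup table from the constant instead of hard-coding it. */
static void init_index64(void)
{
    for (unsigned i = 0; i < 64; ++i)
        index64[(DEBRUIJN64 << i) >> 58] = (unsigned char)i;
    index64_ready = 1;
}

/* Index (0..63) of the least significant set bit; bb must be nonzero. */
unsigned bitscan_forward(uint64_t bb)
{
    assert(bb != 0);
    if (!index64_ready)
        init_index64();

    uint64_t lsb = bb & (0 - bb);   /* isolate the lowest set bit */

    /* Multiplying by a single bit shifts the sequence left by that bit's
       index; the top 6 bits then identify the shift uniquely.
       The index is unsigned, so no sign extension is needed when it is
       used in 64-bit address arithmetic. */
    unsigned idx = (unsigned)((lsb * DEBRUIJN64) >> 58);
    return index64[idx];
}
```

Calling bitscan_forward(0x28) isolates bit 3 and looks it up through one 64*64 multiply, one shift and one table load. Whether that actually beats bsf on a given Opteron would have to be measured.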