Author: Gerd Isenberg
Date: 23:51:17 10/20/03
<snip>
>It is true that fact of having 16 registers instead of 8 do help but it is still
>not that rosy 128 that we have on Intel's 64 bits. Could those 16 registers on
>AMD be helped by some additional and quick usage of registers from coprocessor,
>like in Intel's MMX version?
>
>Leonid.

Yes. AMD64 has three register files, each with its own instruction set and decode/execute units:

1. 16 64-bit general-purpose registers: rax..rdi, r8..r15
2. 16 128-bit XMM registers for SSE/SSE2 (Streaming SIMD Extensions)
3. 8 64-bit MMX registers (shared with x87)

SSE(2) and MMX share the same three FP execution units. With Windows for AMD64, SSE(2) becomes the default FP unit. Unfortunately, MMX (x87) state is not saved/restored on a context switch in 64-bit mode.

There are SSE(2) integer instructions for vectors of 8/16/32/64-bit elements. They are nice for independent, branchless instruction chains, e.g. with pairs of bitboards.

Moving data between gp and xmm/mmx registers should be avoided, because the movq instructions involved are slow vector-path instructions. Using memory for that purpose is also critical: AMD suggests padding about 10 other instructions between a 64-bit store and 2*32-bit loads, or vice versa.

MSC has intrinsic data types (__m128i) and functions for using SSE2 without assembler (there is no inline assembler anymore).

Gerd
<snip>
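As a rough illustration of the "pairs of bitboards" idea with intrinsics, here is a minimal sketch. The function name shift_east_pair and the square mapping (bit 0 = a1, bit 7 = h1) are assumptions made for the example, not something from the post. Both bitboards are loaded straight from memory into one XMM register, so no gp <-> xmm move is needed.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_slli_epi64, ... */
#include <stdint.h>

/* Hypothetical helper: shift two bitboards one file east in one pass.
   Assumes bit 0 = a1 and bit 7 = h1, so "east" is a left shift by 1. */
static void shift_east_pair(const uint64_t in[2], uint64_t out[2])
{
    /* Clears the a-file in both lanes, removing bits that wrapped
       around from the h-file of the rank below. */
    static const uint64_t not_a_file[2] = {
        0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL
    };

    __m128i mask = _mm_loadu_si128((const __m128i *)not_a_file);
    __m128i bb   = _mm_loadu_si128((const __m128i *)in);  /* two 64-bit lanes */

    bb = _mm_slli_epi64(bb, 1);      /* shift each 64-bit lane left by 1 */
    bb = _mm_and_si128(bb, mask);    /* branchless wrap-around removal */

    _mm_storeu_si128((__m128i *)out, bb);
}
```

Calling it with in = {white pawns, black pawns} fills out with both shifted sets. The unaligned load/store variants are used only so the sketch works with any array; aligned data could use _mm_load_si128 and _mm_store_si128 instead.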
Last modified: Thu, 15 Apr 21 08:11:13 -0700