Computer Chess Club Archives



Subject: Re: P4?

Author: Robert Hyatt

Date: 08:43:05 10/16/03



On October 16, 2003 at 09:49:06, Gerd Isenberg wrote:

><snip>
>>>P4 (and AMD64) has 8 128-bit SSE2 registers that can be treated (among other
>>>things) as 4 32-bit floats or 2 64-bit floats. You can do some operations on
>>>those registers in parallel; for example, you can add two float vectors of
>>>length 2 using one instruction. I am not sure if the current P4 implementation
>>>performs that addition in one cycle (that is definitely not so for
>>>Opteron/AMD64), but nothing in theory prevents this.
>>>
>>
>>IIRC, two cycles latency for most common logical, arithmetical and shift
>>mmxReg[,mmxReg]-instructions on P4 (movdqa reg,reg takes 6!), SIMD float as well
>>as double and integer (plus 1 cycle throughput).
>
>
>Oops, sorry, float and double arithmetic instructions have higher latency, on P4
>as well as on AMD64. Two cycles is only true for SSE2 integer instructions (the
>ones I have used so far), such as pand, por, pxor, padd...
>
>Intel® Pentium® 4 and Intel® Xeon™ Processor Optimization Reference Manual
>
>(latency,throughput):
>
>ADDPS xmm, xmm 4,2
>ADDPD xmm, xmm 4,2
>MULPD xmm, xmm 6,2
>
>
>> I think same for AMD64, so
>>called double direct path instructions, decoded as two 64-bit macro ops.
>
>Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors
>
>Latency:
>
>ADDPS xmm, xmm 5
>ADDPD xmm, xmm 5
>MULPD xmm, xmm 5
>
>Gerd


Being off by a factor of 2-3 is not very significant when dealing with numbers
posted by Vincent.  :)  A number within a factor of 2-3 is, for him, an
_excellent_ approximation.  That notwithstanding, the C90 can certainly
produce 6 floating point results every 4 nanoseconds as a theoretical peak.  It
can sustain 4 (about 1 GFLOPS) with many codes, and go beyond that for certain
special cases.  It is hard to imagine a PC coming anywhere near that, even
comparing today's PC with the C90, which is 13 years old.  There are much faster
Crays around today...
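
For anyone who wants to check the arithmetic behind these figures, here is a rough sketch. The C90 numbers (6 results peak and 4 sustained per 4 ns) and the P4 throughput of 2 cycles for ADDPS/ADDPD come from the text above; the function names are made up for illustration, and the per-cycle figures assume one such instruction can be issued back to back at the quoted throughput, which real code may not achieve.

```python
def gflops(results: int, interval_ns: float) -> float:
    """Results per nanosecond is the same as billions of results per second."""
    return results / interval_ns


def results_per_cycle(lanes: int, throughput_cycles: int) -> float:
    """Results per clock for one SIMD instruction stream (hypothetical helper).

    lanes: values handled per instruction (2 doubles or 4 floats in 128 bits).
    throughput_cycles: cycles between successive issues of that instruction.
    """
    return lanes / throughput_cycles


# Cray C90 figures quoted in the post.
c90_peak = gflops(6, 4.0)        # 6 results every 4 ns
c90_sustained = gflops(4, 4.0)   # 4 results every 4 ns

# P4 figures from the quoted Intel table (throughput of 2 cycles).
p4_addpd = results_per_cycle(2, 2)  # 2 doubles per ADDPD
p4_addps = results_per_cycle(4, 2)  # 4 floats per ADDPS

print(f"C90 peak:      {c90_peak} GFLOPS")
print(f"C90 sustained: {c90_sustained} GFLOPS")
print(f"P4 ADDPD:      {p4_addpd} results/cycle")
print(f"P4 ADDPS:      {p4_addps} results/cycle")
```

The P4 lines deliberately stop at results per cycle: turning them into GFLOPS needs a clock rate, which the thread does not give, and sustained rates on real code depend on memory bandwidth as much as on instruction throughput.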




Last modified: Thu, 15 Apr 21 08:11:13 -0700
