Author: Gerd Isenberg
Date: 01:52:26 07/18/04
>I am guessing something like 50 cycles? Really not that bad . . . probably
>close to the speed of a scan over attack tables.
>
>anthony

Yes, fewer than 30 SSE2 instructions, almost no register stalls, but about 50 cycles ;-(

OK, double direct path instructions have a latency of almost 2 cycles, 4 cycles with a memory operand; psadbw has 4.

I hope AMD's promise from the optimization guide comes true some day:

Chapter 9 Optimizing with SIMD Instructions
...
• Future processors with more or wider multipliers and adders will achieve better throughput using SSE and SSE2 instructions. (Today’s processors implement a 128-bit-wide SSE or SSE2 operation as two 64-bit operations that are internally pipelined.)

Gerd
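
For readers who have not met psadbw: it sums absolute byte differences separately in each 64-bit half of the register, so against a zero register it works as a horizontal byte add. The sketch below is only an illustration of that instruction, not the routine discussed in this thread. The helper name sum_bytes_16 is made up for the example, and the scalar fallback is an assumption added so the code also builds where SSE2 is not available.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: _mm_sad_epu8 maps to psadbw */
#endif

/* Hypothetical helper: returns the sum of the 16 unsigned bytes at p. */
static unsigned sum_bytes_16(const uint8_t *p)
{
#if defined(__SSE2__)
    /* Unaligned load of 16 bytes. */
    __m128i v = _mm_loadu_si128((const __m128i *)p);

    /* psadbw against zero: |byte - 0| summed per 64-bit half.
       Each half ends up with its 8-byte sum in the low 16 bits. */
    __m128i sums = _mm_sad_epu8(v, _mm_setzero_si128());

    /* Low half: bits 0..15. High half: 16-bit word number 4. */
    unsigned lo = (unsigned)_mm_cvtsi128_si32(sums);
    unsigned hi = (unsigned)_mm_extract_epi16(sums, 4);
    return lo + hi;
#else
    /* Scalar fallback with the same result (assumption for portability). */
    unsigned s = 0;
    for (int i = 0; i < 16; ++i)
        s += p[i];
    return s;
#endif
}
```

Note that the two partial sums still have to be combined afterwards, which adds a couple of dependent instructions on top of the psadbw latency mentioned above.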
Last modified: Thu, 15 Apr 21 08:11:13 -0700