Author: Tom Kerrigan
Date: 12:09:18 02/10/03
On February 10, 2003 at 02:41:50, Matt Taylor wrote:

>>It can do _static_ reordering, not dynamic.
>
>Reordering is reordering. Optimization at compile-time has more potential than
>optimization at run-time. Run-time reordering has limited foresight.

More potential, limited foresight, blah blah blah. No matter how many vague notions you attribute to IA-64, you still can't explain why it's not faster per-clock than several similarly-clocked OOO chips. Arguing with you about this is worthless.

>Dynamic reordering is valuable when you have a few registers so you can kind've
>sort've make use of the 40 internal registers on IA-32 chips, but IA-64 has
>many.

So what? OOO is said to increase 21264 performance by 30%. The 21264, BTW, has 32 registers and 40 reorder registers.

>Yes. It appears I was looking at a 32-bit Sparc machine. I was reading a paper

Have any 32 bit SPARCs been made since 1995?

>It seems the SPEC scores are generally higher on chips with more cache, and the
>only McKinley score listed has a 1.5 MB L3 cache.

I can't seem to access SPEC scores right now, but what's the point of a super-awesome post-RISC ISA if it's just going to get beat by chips with more cache? And if cache really is the limiting factor in McKinley's performance here, it must be idle a significant amount of time, which reduces IPC and means HT would be beneficial.

>Again, I have no actual experience with an IA-64 machine because they're rather
>expensive. I can only rely on what I've read. I have never read anything about
>low IPC on IA-64. Please offer some evidence/article.

It can still be relatively high and benefit from HT.

>In compiler-generated code, my Athlon tends to retire closer to 2 instructions
>per clock. I would assume that McKinley does better. The restrictions really

Which tool are you using to measure that?
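
To put rough numbers on the cache/IPC point above, here is a back-of-the-envelope sketch. It is only an illustration: the peak IPC and the stall fraction are made-up inputs, not measurements of McKinley or any other chip, and the function names are hypothetical. The only hardware fact assumed is that McKinley can issue up to 6 instructions per clock.

```python
def effective_ipc(peak_ipc, stall_fraction):
    """Average IPC over all cycles when a fraction of cycles retire nothing
    (for example, while waiting on a cache miss)."""
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be between 0 and 1")
    return peak_ipc * (1.0 - stall_fraction)


def idle_issue_slots(issue_width, ipc):
    """Fraction of issue slots left empty on average. This is the headroom
    a second hardware thread (HT) could in principle fill."""
    if issue_width <= 0:
        raise ValueError("issue_width must be positive")
    return 1.0 - ipc / issue_width


# Hypothetical inputs, chosen only to illustrate the argument.
ISSUE_WIDTH = 6        # instructions issued per clock at most
PEAK_IPC = 3.0         # assumed IPC while the core is not stalled
STALL_FRACTION = 0.4   # assumed share of cycles lost to memory stalls

ipc = effective_ipc(PEAK_IPC, STALL_FRACTION)
print(f"effective IPC:    {ipc:.2f}")
print(f"idle issue slots: {idle_issue_slots(ISSUE_WIDTH, ipc):.0%}")
```

With inputs like these the IPC still looks respectable next to a 3-wide chip, yet most issue slots go unused, which is the sense in which IPC can be "relatively high" and still leave room for HT.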

>>>ignoring the Intel C exception of using scalar SSE -- not useful to chess
>>>programs, not very good justification of SSE either when they could have
>>>introduced new flat-register FP instructions.)
>
>Original SSE is flat-register FP. SSE 2 allows double-precision FP computation.

How do you make these two statements agree?

-Tom
Last modified: Thu, 15 Apr 21 08:11:13 -0700