Computer Chess Club Archives



Subject: Re: 64-bit machines

Author: Tom Kerrigan

Date: 12:09:18 02/10/03



On February 10, 2003 at 02:41:50, Matt Taylor wrote:

>>It can do _static_ reordering, not dynamic.
>Reordering is reordering. Optimization at compile-time has more potential than
>optimization at run-time. Run-time reordering has limited foresight.

More potential, limited foresight, blah blah blah. No matter how many vague
notions you attribute to IA-64, you still can't explain why it's not faster
per-clock than several similarly-clocked OOO chips. Arguing with you about this
is worthless.

>Dynamic reordering is valuable when you have a few registers so you can kind of,
>sort of make use of the 40 internal registers on IA-32 chips, but IA-64 has
>many. So what?

OOO is said to increase 21264 performance by 30%. The 21264, BTW, has 32
registers and 40 reorder registers.

>Yes. It appears I was looking at a 32-bit Sparc machine. I was reading a paper

Have any 32-bit SPARCs been made since 1995?

>It seems the SPEC scores are generally higher on chips with more cache, and the
>only McKinley score listed has a 1.5 MB L3 cache.

I can't seem to access SPEC scores right now, but what's the point of a
super-awesome post-RISC ISA if it's just going to get beaten by chips with more
cache? And if cache really is the limiting factor in McKinley's performance
here, the core must be sitting idle a significant amount of the time, which
reduces IPC and means HT would be beneficial.
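
A toy model can make the stall argument concrete. This is only an illustrative sketch, not a measurement of McKinley or any real chip: the function names, the peak width of 4, and the 50% stall fraction are all made-up assumptions. It also assumes that stalls in different threads are independent and that a single ready thread can use the full issue width, which real hardware will not achieve.

```python
def effective_ipc(peak_ipc, stall_fraction):
    """Average IPC when a fraction of all cycles retire nothing (e.g. cache misses)."""
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be between 0 and 1")
    return peak_ipc * (1.0 - stall_fraction)


def smt_ipc_upper_bound(peak_ipc, stall_fraction, threads=2):
    """Idealized multithreaded IPC: the core only idles when every thread is stalled.

    Assumes independent stalls per thread (hypothetical simplification).
    """
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be between 0 and 1")
    all_stalled = stall_fraction ** threads
    return peak_ipc * (1.0 - all_stalled)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the shape of the argument.
    print(effective_ipc(4.0, 0.5))        # one thread, stalled half the time
    print(smt_ipc_upper_bound(4.0, 0.5))  # two threads sharing the core
```

Under these assumptions, the more time a single thread spends waiting on memory, the larger the gap between the two figures, which is the sense in which a cache-limited core leaves room for HT.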

>Again, I have no actual experience with an IA-64 machine because they're rather
>expensive. I can only rely on what I've read. I have never read anything about
>low IPC on IA-64. Please offer some evidence/article.

It can still be relatively high and benefit from HT.

>In compiler-generated code, my Athlon tends to retire closer to 2 instructions
>per clock. I would assume that McKinley does better. The restrictions really

Which tool are you using to measure that?

>>>ignoring the Intel C exception of using scalar SSE -- not useful to chess
>>>programs, not very good justification of SSE either when they could have
>>>introduced new flat-register FP instructions.)
>Original SSE is flat-register FP. SSE 2 allows double-precision FP computation.

How do you make these two statements agree?

-Tom




Last modified: Thu, 15 Apr 21 08:11:13 -0700
