Computer Chess Club Archives


Subject: Re: Where is now 64 bits Intel's computer?

Author: Jeremiah Penery

Date: 21:15:59 05/13/02


On May 13, 2002 at 09:58:13, Vincent Diepeveen wrote:

>On May 12, 2002 at 13:47:00, Jeremiah Penery wrote:
>
>>On May 12, 2002 at 06:42:27, Vincent Diepeveen wrote:
>>
>In short for C programs like mine this thing might be very fast,
>especially because it's in order.

Can you give one possible reason why in-order execution would ever be faster
than out-of-order?  The general rule, AFAIK, is that out-of-order execution
increases speed about 30% on normal integer code.

In-order execution forces the compiler to do _all_ instruction scheduling.
Scheduling for a processor as wide as Itanium must be a nightmare for most code.
In addition, "The compiler adds branch hints, register stack and rotation, data
and control speculation, and memory hints into EPIC instructions." (quoted from
http://www.sharkyextreme.com/hardware/guides/itanium/3.shtml)

Managing all those registers, instruction units, and all that other stuff
requires a super-smart compiler.  The compiler is not mature enough yet to
exploit the full potential of this hardware.  I would bet that IA-64 speed could
increase by at least 50% by improvement in compiler technology alone.

>>> not extreme penalty however for misprediction, loads of registers, and a big L1 cache.
>>
>>There is very little penalty for misprediction, since it has full hardware

What I said was slightly wrong.  The misprediction penalty works the same way
as on any other processor: the pipeline must be flushed and restarted, which
costs 11 cycles or something on Itanium.  However, predication removes some
branches outright, so there are fewer branches left to mispredict.  The way it
works is that in an if/then/else situation, the 'then' and the 'else' are both
computed in parallel, each guarded by a predicate, and only the result whose
predicate is true is kept.
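
To make the predication idea concrete, here is a rough software analogy in Python. It is only a sketch: real predication is done per instruction in hardware, and the function names and the arithmetic in the two arms are made up for illustration. The point is just that both arms get computed and a mask picks one, so there is no branch on the condition to mispredict.

```python
# Illustrative analogy only: IA-64 predication happens in hardware, per instruction.
# The function names and the 'a + 1' / 'b * 2' arms are hypothetical.

def select_with_branch(cond, a, b):
    """Ordinary if/else: a CPU would have to predict which way this goes."""
    if cond:
        return a + 1      # 'then' arm
    else:
        return b * 2      # 'else' arm


def select_predicated(cond, a, b):
    """If-converted form: compute both arms, then keep exactly one."""
    then_val = a + 1               # computed regardless of cond
    else_val = b * 2               # computed regardless of cond
    mask = -int(bool(cond))        # -1 (all bits set) if cond is true, else 0
    return (then_val & mask) | (else_val & ~mask)
```

Both functions return the same value for any integer inputs. The trade-off is also visible here: the predicated version always does the work of both arms, which only pays off when the arms are short and the branch is hard to predict.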

>>predication.  It also has a ton of registers, but it can only access 128(?) at a
>>time, and the rest it can get through a large rotating register file, which may
>>have some penalty associated with it, I don't remember specifically.
>>
>>As I said above, the L1 cache is actually very small.
>
>128 registers kicks butt!
>
>The L1 cache is heaven compared to the P4!
>
>I assume the L2 cache is better than that of the P4/P3 and at K7
>level. That takes away some pain too!

McKinley has a really nice cache structure, with very low latency.  Of course,
Intel has always been able to make a really nice cache.

>This sounds like a REAL fast processor for me!!

SPECint for the 800MHz Itanium is 365, about half of current P4/Athlon numbers.
For Crafty, Itanium had a runtime of 252; compare that to the runtime of 97.8
for a 1733MHz AthlonXP.  So the Itanium does about 84% as much work on Crafty
per clock cycle, and of course the McKinley should be quite a bit better.  That
wouldn't be too bad if they could clock Itanium at anywhere near x86 speeds.
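
For anyone who wants to check the per-clock figure, the arithmetic can be sketched like this. It assumes work per clock is proportional to 1 / (runtime * clock speed) on the same benchmark; the function name is made up for illustration, and the numbers are the ones quoted above.

```python
def relative_work_per_clock(runtime_a, mhz_a, runtime_b, mhz_b):
    """Work per clock of machine A relative to machine B on the same benchmark.

    Work per clock is taken as proportional to 1 / (runtime * clock), so the
    ratio A/B works out to (runtime_b * mhz_b) / (runtime_a * mhz_a).
    """
    return (runtime_b * mhz_b) / (runtime_a * mhz_a)


# Itanium: runtime 252 at 800MHz.  AthlonXP: runtime 97.8 at 1733MHz.
itanium_vs_athlon = relative_work_per_clock(252, 800, 97.8, 1733)
print(f"{itanium_vs_athlon:.0%}")  # prints 84%
```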

On FP, Itanium is already quite good, and McKinley will be pretty killer in that
department.


