Author: Vincent Diepeveen
Date: 06:58:13 05/13/02
On May 12, 2002 at 13:47:00, Jeremiah Penery wrote:

>On May 12, 2002 at 06:42:27, Vincent Diepeveen wrote:
>
>>On May 12, 2002 at 00:31:51, Martin Andersen wrote:
>>
>>And it is called McKinley and on paper it's impressive
>>what it delivers a Mhz.
>>
>>just a few details i remember:
>>  1Ghz , 3MB L2,
>
>The cache number is wrong. Itanium (and McKinley) have only 32KB of L1 cache
>(16KB code/16KB data). Itanium has 96K of L2 cache, McKinley has only 256K of
>L2. The 3MB is L3 cache, which is on-chip, with 12-cycle access in McKinley (20
>cycles in Itanium).

Still is impressive, though the L1 cache is a bit disappointing, depending upon
how big a word in L1 cache is. I assume 64-bit words. That makes the L1 data
cache only 2048 words or so. Still twice as big as the P4's!!

>>  6 instructions a clock,
>
>Theoretically it can execute this, but hardly ever in practice (on integer
>code). The reason is that the instructions must be bundled in groups of 6, and
>that Itanium is an _in-order_ processor. If there aren't 6 instructions it can
>bundle together, it has to issue a bunch of no-ops in the bundle. In addition,
>the compiler technology for IA-64 is very immature. I'm sure with better
>compilers they will be able to come closer to that theoretical limit.

Well, at the end of 1996 or so they said the same about the Pentium Pro:
"who can use 3 instructions a clock?"

But back then it was exactly 3x faster for me than a P5-133MHz, which
could do only 2 instructions a clock.

Of course the reason why DIEP was so much faster on it was that C compilers
produce 8-bit + 32-bit code, whereas others who tricked around in 16-bit
assembly got nailed, to put it mildly.

In short, for C programs like mine this thing might be very fast, especially
because it's in-order.

>>  not extreme penalty however for misprediction, loads of registers, and a big L1 cache.
>
>There is very little penalty for misprediction, since it has full hardware
>predication. It also has a ton of registers, but it can only access 128(?) at a
>time, and the rest it can get through a large rotating register file, which may
>have some penalty associated with it, I don't remember specifically.
>
>As I said above, the L1 cache is actually very small.

128 registers kicks butt! The L1 cache is heaven compared to the P4!

I assume the L2 cache is better than that of the P4/P3 and at K7 level.
That takes away some pain too!

This sounds like a REAL fast processor for me!!

>>What do you need more?
>>
>>The first cpu was of course not so fast, but making it already was enough
>>to impress the world because of the price a cpu intel can make it for.
>
>I'm not sure what you're talking about here. The Itanium is a very big and very
>expensive processor.
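
For anyone who wants to check the L1 word count estimated earlier in this post, here is a minimal sketch of the arithmetic. The 16KB data cache figure is from the quoted text and the 64-bit word size is the assumption already stated there. The 8KB L1 data cache for the P4 is an assumption of this sketch, not a number given in the thread, and the function name `cache_words` is made up for illustration.

```python
KIB = 1024  # bytes per KB as used for cache sizes


def cache_words(cache_bytes: int, word_bytes: int) -> int:
    """Number of fixed-size words that fit in a cache of the given size."""
    return cache_bytes // word_bytes


# From the thread: 16KB of L1 data cache, with 64-bit (8-byte) words assumed.
itanium_l1d_words = cache_words(16 * KIB, 8)

# Assumption of this sketch (not stated in the thread): P4 L1 data cache is 8KB.
p4_l1d_words = cache_words(8 * KIB, 8)

print(itanium_l1d_words)                  # 2048
print(itanium_l1d_words // p4_l1d_words)  # 2
```

If the word size were 32 bits instead, the same cache would hold 4096 words, so the "2048 words" figure depends entirely on the 64-bit assumption.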
Last modified: Thu, 15 Apr 21 08:11:13 -0700