Author: Robert Hyatt
Date: 10:55:00 12/14/02
On December 14, 2002 at 01:09:12, Matt Taylor wrote:

>On December 13, 2002 at 23:05:39, Robert Hyatt wrote:
>
>>On December 13, 2002 at 21:16:55, Matt Taylor wrote:
>>
>>>On December 12, 2002 at 10:15:16, Vincent Diepeveen wrote:
>>>
>>>>On December 11, 2002 at 02:34:33, Matt Taylor wrote:
>>>>
>>>>[snip]
>>>>>Eugene's explanation fits, though. I am surprised that Intel did not duplicate
>>>>>the trace cache for both logical CPUs. It's like trying to fit an even bigger
>>>>>peg into an already too small hole...
>>>>>-Matt
>>>>
>>>>Exactly, but the hardware reason to do that is very simple.
>>>>
>>>>They can clock the thing to 3.04GHz now. 2.8GHz for the Xeon.
>>>>
>>>>But if you double the L1 data cache size or the trace cache size
>>>>(i will not do a statement what in my eyes is smarter to duplicate
>>>>because you can see my next sentence why) then you have a major other
>>>>problem.
>>>>
>>>>You won't be able to clock it to 3.04GHz then nor 2.8GHz for the Xeon.
>>>>
>>>>If you have something small, you can clock it high.
>>>>
>>>>If you have something big like an Itanium2 or the 128KB L1 cache
>>>>of a K7 then you can't clock it that easily to 3.04GHz.
>>>>
>>>>So the clocking and the size of such important integrated things into
>>>>the processor is very closely related.
>>>
>>>Actually that's not true. There are some 42 million transistors on the P4
>>>Northwood -- more than on Athlon or any IA-32 processor prior to it. Yet it
>>>clocks up higher than any competing chips have. The trick isn't to make things
>>>simple; it's to split them up.
>>>
>>>Two independent trace caches would scale fine without adding significant cost to
>>>the processor. However, it would impede Intel profit margins because it would
>>>require a bit of redesign.
>>>
>>>-Matt
>>
>>
>>I don't think anyone would want two separate trace caches, as that would violate
>>the very principle of hyper-threading. Rather, a larger trace cache, with a
>>wider path out so that 2x the micro-ops can be spit out at once would be the
>>hyper-thread design approach keeping with the spirit of SMT overall.
>
>The core principle of HT is that you have 2 threads using one set of execution
>units, not necessarily all of the chip's facilities.

Correct... except that the added hardware is minimized to keep the chip price
increase minimal and the yield basically unaffected.

>
>Doubling the size and width of the u-op cache would work, too, but I think it
>would be more difficult to build than a mux of two caches.

I don't think doubling the bus out of the trace cache would be much of a
problem, particularly as feature sizes drop to sub-0.10 micron soon. And
increasing the width is simpler than trying to manage two caches and having
them both feed the pool of execution units, unless I am overlooking something
obvious...

>
>-Mat
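To make the two proposals concrete, here is a toy sketch of how many micro-ops per cycle each front end could hand to the shared execution pool. Everything in it is an assumption made for illustration: the function names, the `width` parameter, and the "each thread gets its share first, leftovers go to whoever still has work" policy are hypothetical and are not taken from any Intel documentation. It models only delivery bandwidth, not cache capacity, clock speed, or die cost.

```python
def shared_wide(demand_a, demand_b, width):
    """One trace cache with a double-width path (2 * width) shared by both threads.

    Hypothetical policy: each thread is first served up to `width` micro-ops,
    then any unused slots go to the thread that still has micro-ops ready.
    """
    a = min(demand_a, width)
    b = min(demand_b, width)
    spare = 2 * width - a - b          # slots left unused after the first pass
    extra_a = min(demand_a - a, spare)
    a += extra_a
    spare -= extra_a
    b += min(demand_b - b, spare)
    return a, b


def muxed_pair(demand_a, demand_b, width):
    """Two private trace caches, each `width` wide, muxed into the same pool.

    A thread can never use the other cache's path, even when it sits idle.
    """
    return min(demand_a, width), min(demand_b, width)


if __name__ == "__main__":
    width = 3  # illustrative per-thread width, an assumed value
    for demand in [(6, 6), (6, 0), (5, 1)]:
        print(demand, shared_wide(*demand, width), muxed_pair(*demand, width))
```

Under these assumptions the two designs deliver the same number of micro-ops when both threads are busy, and differ only when one thread stalls: the shared wide path lets the running thread use the idle slots, while the muxed pair leaves them unused. Whether the wider path or the mux is easier to build in silicon is a separate question that this sketch does not address.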
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.