Author: Vincent Diepeveen
Date: 11:17:34 06/17/04
On June 17, 2004 at 13:34:33, Eugene Nalimov wrote:

>On June 17, 2004 at 13:29:02, Anthony Cozzie wrote:
>
>>On June 17, 2004 at 13:20:40, Eugene Nalimov wrote:
>>
>>>On June 17, 2004 at 06:55:18, Vincent Diepeveen wrote:
>>>
>>>>[...]
>>>>
>>>>Please list the processors in order of L2 cache speed and you'll realize that
>>>>speed still is of overwhelming importance. List them at random access speed for
>>>>L2 cache (some processors are faster in streaming than random access in their
>>>>caches like P4).
>>>>
>>>>Basically opteron has fastest L2 cache which can deliver each 13 cycles data (4
>>>>reads simultaneously even if i understand well). No other processor can deliver
>>>>data from L2 cache that fast.
>>>
>>>Intel Itanium 2 Processor Reference Manual For Software Development and
>>>Optimization, Table 6-4 "Cache Summary":
>>>
>>>Itanium2 cache latency:
>>> L1: 1 cycle, 4 loads/cycle

Please quote random access times to L1 cache.

>>> L2: 5 cycles (integer loads), 4 loads/cycle

Please quote random access times to L1 cache. Note that it is 7 cycles according to Jason Priestly (intel) when doing *sequential* reads. See his seminar for dutch supercomputer Aster july 2003 where i was watching. www.sara.nl

>>> L3: 12/14 cycles, depending on cache size (integer loads), 1 load/cycle

14-17 cycles according to jason priestly where 14 is really the 'optimal' case. But that's not *sequential* read.

>>>
>>>Thanks,
>>>Eugene
>>>
>>
>>Correct me if I am wrong, but aren't Itanium's caches off by 1? In other words,
>>the 6MB cache on the Itanium is L3, and the L1 cache is like 1KB?
>

itanium : L1D: 16KB 1.4Ghz (the 1.5ghz ones are like $5000)
Opteron : L1D: 64KB 2.4Ghz

>L1I: 16KB ==> blocks of 6 instructions must get used!!!!!

Opteron : 64KB and it doesn't need to store blocks of 6 instructions

>L2: 256KB ==> and does it store INSTRUCTIONS which is the weak spot ????

Opteron : 1MB also storing instructions, itanium??
13 cycles RANDOM ACCESS

Itanium can only execute blocks of 3 instructions and needs to execute 2 blocks each clock. That with just a 16 KB instruction cache. So the weak spot of the Itanium lies in other areas too. How can you keep feeding double blocks, 6 instructions per clock, out of 16 KB nonstop? You can effectively divide the cache sizes of the Itanium by 3.

The Intel C++ compiler team said in an interview ( www.realworldtech.com ) that their big problem is keeping the instruction cache filled. If you can't even keep the instruction cache filled, then what are we talking about?

So it's fast for DSP-style floating point, even slightly faster than Opteron despite Opteron being clocked higher, but for the price of a dual Itanium 1.5Ghz you can also buy a quad Opteron 2.4Ghz. So for real DSP-like stuff you can buy x times more Opterons for less money.

What you quote here are DSP sequential cycle times.

>L3: 1.5/3/6MB

No advantage in computer chess to have one. Of course for Itanium it's crucial to have one because its L1 & L2 are practically nonexistent.

>Thanks,
>Eugene
>
>>It is really amazing to me that Intel can't clock Itanium at 3+ GHZ.
>>
>>anthony
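
To put the instruction cache argument above in numbers, here is a rough back-of-envelope sketch. It uses the figures from this post (16 KB L1I, blocks of 3 instructions, 2 blocks per clock) plus one assumption not stated above: that an IA-64 instruction block (bundle) is 128 bits, i.e. 16 bytes. The function name `l1i_capacity` is made up for illustration. It only counts capacity at peak issue rate and says nothing about real hit rates.

```python
# Back-of-envelope capacity of the Itanium 2 L1 instruction cache.
L1I_BYTES = 16 * 1024      # from the post: 16 KB L1I
BUNDLE_BYTES = 16          # assumption: one IA-64 bundle is 128 bits
INSTR_PER_BUNDLE = 3       # from the post: blocks of 3 instructions
BUNDLES_PER_CLOCK = 2      # from the post: 2 blocks each clock


def l1i_capacity(cache_bytes=L1I_BYTES):
    """Return (bundles, instruction slots, clocks to run through the cache)."""
    bundles = cache_bytes // BUNDLE_BYTES
    slots = bundles * INSTR_PER_BUNDLE
    # Clocks of straight-line code the cache holds at peak issue rate.
    clocks = bundles // BUNDLES_PER_CLOCK
    return bundles, slots, clocks


bundles, slots, clocks = l1i_capacity()
print(f"{bundles} bundles, {slots} instruction slots, "
      f"{clocks} clocks of straight-line code at peak issue")
# prints: 1024 bundles, 3072 instruction slots, 512 clocks of straight-line code at peak issue
```

So under these assumptions the whole L1I holds about 512 clocks' worth of straight-line code at full issue rate, which is the point about how hard it is to keep it filled.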
Last modified: Thu, 15 Apr 21 08:11:13 -0700