Computer Chess Club Archives


Subject: Re: Processor's

Author: Vincent Diepeveen

Date: 11:17:34 06/17/04


On June 17, 2004 at 13:34:33, Eugene Nalimov wrote:

>On June 17, 2004 at 13:29:02, Anthony Cozzie wrote:
>
>>On June 17, 2004 at 13:20:40, Eugene Nalimov wrote:
>>
>>>On June 17, 2004 at 06:55:18, Vincent Diepeveen wrote:
>>>
>>>>[...]
>>>>
>>>>Please list the processors in order of L2 cache speed and you'll realize that
>>>>speed still is of overwhelming importance. List them at random access speed for
>>>>L2 cache (some processors are faster in streaming than random access in their
>>>>caches like P4).
>>>>
>>>>Basically opteron has fastest L2 cache which can deliver each 13 cycles data (4
>>>>reads simultaneously even if i understand well). No other processor can deliver
>>>>data from L2 cache that fast.
>>>
>>>Intel Itanium 2 Processor Reference Manual For Software Development and
>>>Optimization, Table 6-4 "Cache Summary":
>>>
>>>Itanium2 cache latency:
>>>  L1: 1 cycle, 4 loads/cycle
Please quote random access times to L1 cache.

>>>  L2: 5 cycles (integer loads), 4 loads/cycle
Please quote random access times to L2 cache.

Note that it is 7 cycles according to Jason Priestly (Intel) when doing
*sequential* reads. See his seminar for the Dutch supercomputer Aster, July 2003,
which I attended. www.sara.nl

>>>  L3: 12/14 cycles, depending on cache size (integer loads), 1 load/cycle
14-17 cycles according to Jason Priestly, where 14 is really the 'optimal' case.
But that's not a *sequential* read.
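
To make the sequential versus random distinction concrete, here is a minimal sketch of the usual pointer-chasing setup. It is not taken from the manual or the seminar; all function names are made up for illustration. The random chain uses Sattolo's algorithm so the walk is one single cycle through every slot. Note the assumption: in Python the interpreter overhead dominates, so the timings only show the shape of the test. For real cycle counts the same chain would be walked from C, with the array sized to fit the cache level being measured.

```python
import random
import time

def sequential_chain(n):
    """next[i] = (i + 1) % n: a streaming walk that prefetchers handle well."""
    return [(i + 1) % n for i in range(n)]

def random_chain(n, seed=0):
    """Sattolo's algorithm: a random permutation forming one single cycle,
    so the walk visits every slot exactly once in an unpredictable order."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)          # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt

def walk(nxt, steps):
    """Follow the chain; every load depends on the result of the previous one."""
    i = 0
    for _ in range(steps):
        i = nxt[i]
    return i

def ns_per_step(nxt, steps):
    """Average wall-clock time per dependent load, in nanoseconds."""
    start = time.perf_counter()
    walk(nxt, steps)
    return (time.perf_counter() - start) * 1e9 / steps

if __name__ == "__main__":
    n = 1 << 16                       # hypothetical working-set size (entries)
    for name, chain in (("sequential", sequential_chain(n)),
                        ("random", random_chain(n))):
        print(name, round(ns_per_step(chain, n), 1), "ns/step")
```

Because each step needs the previous result, the walk measures latency rather than bandwidth, which is the number that matters for hash table probes.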

>>>
>>>Thanks,
>>>Eugene
>>>
>>
>>Correct me if I am wrong, but aren't Itanium's caches off by 1?  In other words,
>>the 6MB cache on the Itanium is L3, and the L1 cache is like 1KB?
>
Itanium : L1D: 16KB, 1.4GHz (the 1.5GHz ones are like $5000)
Opteron : L1D: 64KB, 2.4GHz

>L1I: 16KB ==> blocks of 6 instructions must get used!!!!!
Opteron : 64KB and it doesn't need to store blocks of 6 instructions

>L2:  256KB ==> and does it store INSTRUCTIONS which is the weak spot ????
Opteron : 1MB, also storing instructions (Itanium??), 13 cycles RANDOM ACCESS

Itanium can only execute blocks of 3 instructions and needs to execute 2 blocks
each clock, and that with just a 16KB instruction cache.

So the Itanium has weak spots in other areas as well. How can you keep feeding
pairs of blocks (6 instructions per clock) out of 16KB nonstop?
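
A rough back-of-the-envelope sketch of that question, assuming the standard IA-64 encoding (one 128-bit bundle holds 3 instruction slots) and the peak rate of 2 bundles per clock mentioned above. The function name is made up for illustration, and slots that the compiler has to fill with NOPs still take up space, so the useful capacity can be lower than what this computes.

```python
BUNDLE_BYTES = 16        # one IA-64 bundle is 128 bits
SLOTS_PER_BUNDLE = 3     # three instruction slots per bundle
BUNDLES_PER_CLOCK = 2    # peak issue rate: 2 bundles = 6 instructions per clock

def l1i_capacity(cache_bytes):
    """Return (bundles, instruction slots, clocks of straight-line code at peak)."""
    bundles = cache_bytes // BUNDLE_BYTES
    slots = bundles * SLOTS_PER_BUNDLE
    clocks_at_peak = bundles // BUNDLES_PER_CLOCK
    return bundles, slots, clocks_at_peak

print(l1i_capacity(16 * 1024))   # (1024, 3072, 512)
```

So a 16KB instruction cache holds 1024 bundles, which at the peak rate is only 512 clocks of straight-line code before new bundles must come from L2.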

You can effectively divide the Itanium's cache sizes by 3.

The Intel C++ compiler team said in an interview (www.realworldtech.com) that
their big problem is keeping the instruction cache filled.

If you can't even keep the instruction cache filled, then what are we talking about?

So it's fast for DSP floating point, slightly faster even than Opteron despite
the Opteron being clocked higher, but for the price of a dual Itanium 1.5GHz
you can also buy a quad Opteron 2.4GHz. So for real DSP-like work you can buy
x times more Opterons for the same money.

You are quoting DSP sequential cycle times here.

>L3:  1.5/3/6MB

There is no advantage in computer chess to having one. Of course for Itanium it's
crucial to have one, because its L1 & L2 are practically nonexistent.

>Thanks,
>Eugene
>
>>It is really amazing to me that Intel can't clock Itanium at 3+ GHZ.
>>
>>anthony



Last modified: Thu, 15 Apr 21 08:11:13 -0700