Computer Chess Club Archives


Subject: Re: IA-64 vs OOOE (attn Taylor, Hyatt)

Author: Eugene Nalimov

Date: 12:46:26 02/13/03


Tom,

Some time ago I wrote in the (lost) thread: please compare Itanium2 not with
P4/Athlon/etc., but with server CPUs, i.e. with 4-way Xeons, Power4, etc.

You'll be surprised. Itanium2 looks good even when compared with much more
mature CPUs (and compilers), especially on *server-like* code. E.g. take a look
at SPEC2k/gcc. That is the largest SPEC2k benchmark, and of all SPEC2k
benchmarks it resembles real-world server-like code the most: it's (relatively)
large, execution time is not concentrated in a few functions but spread across
a lot of functions, and there are a lot of loops over pointer chains, a lot of
calls, unpredictable branches, etc.
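
To make "loops over pointer chains" with "unpredictable branches" concrete,
here is a minimal sketch of that code shape. It is not taken from gcc or from
SPEC; the node type and the function name are hypothetical, chosen only for
illustration. Each load of p->next depends on the previous load, and the
branch on p->kind depends on the data, which is what makes such loops hard
for any CPU or compiler to speed up.

```c
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
struct node {
    struct node *next;  /* link to the next element in the chain */
    int kind;           /* data-dependent tag tested in the loop */
    int value;
};

/* Walk a pointer chain and sum the values of nodes with the wanted kind.
   The address of the next node is known only after the current node has
   been loaded, and the outcome of the kind test depends on the input. */
long sum_selected(const struct node *p, int wanted_kind)
{
    long total = 0;
    for (; p != NULL; p = p->next) {
        if (p->kind == wanted_kind)
            total += p->value;
    }
    return total;
}
```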

Please note that the Itanium2 system has much less cache than the Power4+
systems (3x less L1, 6x less L2, 10x to 40x less L3). The Alphas have no L3
cache, but again, much more L1 and/or L2 cache.

http://www.spec.org/osg/cpu2000/results/res2002q3/cpu2000-20020711-01469.html
http://www.spec.org/osg/cpu2000/results/res2002q4/cpu2000-20021111-01822.html
http://www.spec.org/osg/cpu2000/results/res2002q4/cpu2000-20021111-01814.html
http://www.spec.org/osg/cpu2000/results/res2003q1/cpu2000-20030113-01918.html
http://www.spec.org/osg/cpu2000/results/res2002q4/cpu2000-20021104-01757.html

Also please note that the results for the Power4+ and Alpha systems were
submitted several months later than those for Itanium2.

Thanks,
Eugene

On February 13, 2003 at 14:20:19, Tom Kerrigan wrote:

>On February 12, 2003 at 23:20:52, Matt Taylor wrote:
>
>>80% accuracy when you do it at runtime. The compiler can know the -exact-
>>probabilities of each branch and take advantage of this. The compiler can know
>>with near-100% certainty where most branches will go. The only variable is
>>input, and every combination of input is assumed to have equal probability.
>
>That's absurd. Dynamic branch prediction is over 90% accurate (over 95% for the
>P4) and static branch prediction is at best 80% accurate, and that's profile
>directed. You tell me which is closer to 100%. The reasons should be obvious if
>you think about it.
>
>>>What, exactly, do you think the point of predication is, then? It's to allow
>>>instructions to execute before the condition is determined, in other words, out
>>>of order. (Or at least in order without being dependent.) If you think
>>>predicated instructions are only executed after the condition is determined,
>>>then what is the difference between a "predicated branch" and a normal branch,
>>>besides some extra instructions?
>>Predication avoids small conditional branches such as the infamous abs, max, and
>>min functions.
>
>Sure, you can avoid having an actual branch instruction. I'm asking you to think
>deeper. How does that make the processor go any faster?
>
>>>And every other SPEC program shows that "in practice" McKinley is clearly slower
>>>than a P4.
>>So there are two results, and you prefer to throw away one rather than
>>attempting an explanation.
>
>No, more like 12 results and in only one case does the Itanium 2 outperform the
>P4. And I think I've done a very good job explaining why Crafty runs faster on
>the I2 than the P4.
>
>-Tom



Last modified: Thu, 15 Apr 21 08:11:13 -0700