Computer Chess Club Archives


Subject: Re: IA-64 vs OOOE (attn Taylor, Hyatt)

Author: Eugene Nalimov

Date: 10:45:49 02/18/03


On February 18, 2003 at 13:20:38, Tom Kerrigan wrote:

>On February 16, 2003 at 00:04:34, Eugene Nalimov wrote:
>
>>On February 15, 2003 at 21:37:19, Tom Kerrigan wrote:
>>
>>>On February 13, 2003 at 15:46:26, Eugene Nalimov wrote:
>>>
>>>>Some time ago I wrote in the (lost) thread: please compare Itanium2 not with
>>>>P4/Athlon/etc., but with server CPUs. I.e. with 4-way Xeons, Power4, etc.
>>>
>>>I don't know how I got dragged into talking about x86 at all. Everybody seems to
>>>assume that I want to prove that the P4 is better than the I2 in all cases no
>>>matter what. The comparison I'm interested in is Opteron vs. I2...
>>>
>>>>benchmarks it resembles real-world server-like code the most: it's (relatively)
>>>>large, execution time is not spent in a few functions but spread across
>>>>a lot of functions, with lots of loops over pointer chains, lots of calls,
>>>>unpredictable branches, etc.
>>>
>>>In other words, it's suited to chips with lots of cache or low latency memory
>>>and low branch mispredict penalties. Opteron addresses all of these issues.
>>
>>Why should it be better than Power4+?
>
>Higher clock speeds (if not initially, shortly thereafter) and lower branch
>mispredict penalties. (12 vs 17 stage pipeline.) Also, is the POWER4's memory
>controller on-die or just in-package? If it's not on-die, that's a big memory
>latency advantage for x86-64...

(1) Why should x86-64 have a higher clock speed? The 1,450MHz Power4+ has been
shipping for several months already. The only possibility I see is that IBM will
not clock Power4+ aggressively, mainly because RAS for server CPUs is more
important than pure speed. But then I doubt that AMD would clock the server
version of x86-64 aggressively, either.
(2) You yourself wrote "lots of cache or low latency memory". Power4+ has so
much cache that I believe it would more than compensate for x86-64's lower
memory latency.
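
The trade-off in point (2) can be sketched with the standard average memory access time formula (hit time plus miss rate times memory latency). This is only a rough illustration: every number below is hypothetical and is not a measurement of Power4+, Opteron, or any other CPU, and the function name `amat` is made up for this sketch.

```python
def amat(hit_ns, miss_rate, mem_ns):
    """Average memory access time: hit time + miss rate * memory latency."""
    return hit_ns + miss_rate * mem_ns

# Hypothetical inputs, chosen only to show the shape of the trade-off.
# A larger cache is modeled as a lower miss rate; an on-die memory
# controller is modeled as a lower memory latency.
big_cache_slow_mem = amat(hit_ns=5.0, miss_rate=0.02, mem_ns=150.0)
small_cache_fast_mem = amat(hit_ns=5.0, miss_rate=0.10, mem_ns=80.0)

print(f"big cache, slower memory:   {big_cache_slow_mem:.1f} ns")
print(f"small cache, faster memory: {small_cache_fast_mem:.1f} ns")
```

With these made-up inputs the larger cache wins even though its memory is slower. Which side wins on real hardware depends entirely on the actual miss rates for the workload, and that is exactly the data we don't have.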

Returning to the original point: I still don't see why x86-64 should be better
than Power4+, and on server-like code the current Itanium2 is faster (and, BTW,
cheaper) than the higher-clocked Power4+. So perhaps for such code in-order
execution is not much worse than OoO (as you suggested)? Or perhaps not worse
at all?

Of course, for now it's all hand-waving. We don't have enough data, because
there are no aggressively designed in-order desktop CPUs. I pointed out that
for server applications a good in-order design is no worse than a much more
mature OoO design with a faster clock speed and 10-30x more cache, and we should
ask: "does a chess program more resemble a server application or a desktop
application?"

Let's wait for the x86-64 launch. But the next generation of Itanium CPUs will
be shipping by then, too :-)

Thanks,
Eugene

>-Tom



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.