Author: Tom Kerrigan
Date: 02:19:15 07/25/03
On July 25, 2003 at 02:22:06, Hristo wrote:

>On July 25, 2003 at 00:51:13, Tom Kerrigan wrote:
>
>>On July 24, 2003 at 23:14:52, Hristo wrote:
>>
>>>On July 24, 2003 at 20:43:14, Dann Corbit wrote:
>>>
>>>>On July 24, 2003 at 17:27:55, Vincent Lejeune wrote:
>>>>
>>>>>http://www.apple.com/quicktime/qtv/wwdc03/ -> click "watch now" -> go to
>>>>>1:40:30; you will see the Power Mac G5 perform a little more than 2 times
>>>>>faster than a dual Xeon 3.06!!! Live run, screens side by side, with 4 or 5
>>>>>different applications.
>>>>
>>>>Figures don't lie.
>>>>But liars will figure.
>>>>
>>>>The shame from Apple's current misinformation campaign won't go away until
>>>>they start telling the truth.
>>>>
>>>>A little distortion is not unexpected. But they are simply telling absurd
>>>>tall tales.
>>>
>>>Dann,
>>>if the applications being compared were using Altivec-optimized code on the
>>>Mac and depended heavily on that part of the code, then the Mac being 2 times
>>>faster is easy to imagine.
>>>
>>>What they, Apple, don't tell "you" as a consumer is that only a few
>>>applications can gain execution speed from the Altivec stuff ... and when it
>>>does happen you can often feel that the P4 is slow, which is not the case for
>>>general-purpose applications.
>>>
>>>Why do you think Apple is not telling the truth? More precisely, what is it
>>>that they are dishonest about, in relation to the above-mentioned demo?
>>>
>>>Regards,
>>>Hristo
>>
>>* The IBM guy was bragging about their 0.13um process, saying how great it is,
>>saying that only IBM and Apple could deliver it... Hmmm... Intel has been
>>selling 0.13um P4s for a year and a half and AMD has been selling 0.13um
>>Athlons for, what, a year? Intel is going to be selling 0.09um processors not
>>long after Apple starts shipping the 0.13um processors that only they are
>>awesome enough to deliver, sure.
>>
>>* Steve says the 3.0GHz P4 was the fastest they could buy, but actually they
>>could have bought a 3.2GHz P4.
>>
>>* Steve says the G5 is the world's first 64-bit desktop processor. The Opteron
>>has been out for months now. If you want to argue that the Opteron is a
>>workstation/server processor and not a "desktop" processor, then why are they
>>comparing the G5 to a Xeon?
>
>All of this is not related to the question of "Why did the applications demoed
>perform so much faster on the Macs?"
>I see no point in arguing about any of these presentation-related inaccuracies
>... :)

You're right, I didn't read your entire post, i.e., "in relation to the
above-mentioned demo?"

>and maybe the people from Adobe, Wolfram Research and Lux... all agreed to lie
>about their experiences with the performance of those systems. :-)

Well, I noticed that the Adobe guy was choosing his words _very_ carefully. He
said something like "some operations may even run up to twice as fast." Not
exactly a mind-blowing endorsement.

I like Wolfram's stuff, so I'd like to think there wasn't any trickery going on
with that demo. Lux is a black box to the public, a research company with no
commercial products, and eMagic, well, they were acquired by Apple. The fact
that the latter two companies were doing demos is "fishy"... it makes you
wonder why Jobs didn't choose to do demos from companies that would be less
controversial.

The fact that the PC was stalling is the kicker, though.
The PC's processor, front-side bus, memory, and hard drive subsystem cannot
possibly be _so_ much slower than Apple's that they would take many times
longer to load a big picture of a shark, or stop playing parts of a song
completely for seconds at a time. The only time PCs behave like that is when
they're thrashing, which leaves us with three possibilities:

1) The dual Xeon had less RAM than the G5. Obviously unfair.

2) The G5 was "primed" for the demos, e.g., they ran the Adobe benchmark right
before running it again for the audience, so that physical memory was all ready
for Photoshop to run and load a huge file, and maybe part of that file was even
still in the disk cache. Obviously unfair. (The first sketch at the bottom of
this post shows how big that cold-vs-warm gap can be.)

3) Similar to priming, Windows memory allocation may just be done differently,
so it's unfair to run these "cold" tests when the app would perform comparably
if you hadn't just booted it.

>It is possible that the G5 Macs will perform better than the 3.06 Xeons which
>were used for the demo, without the Xeons being crippled in any way. Don't you
>think?

Sure. I'm sure there are many real-world cases where the G5 will outperform
PCs. I'm eager to see how fast my chess program runs on one, although I doubt
I'd buy one even if it is faster--I'm not that rich.

>>BTW, you mention Altivec. What makes you think IBM's implementation of
>>Altivec is any better than Intel's implementation of SSE2?
>
>Where to start on this one? :)
>SSE2 provides a fine optimization path for the P4 class of CPUs.
>However, to compare the _implementation_ of SSE2 and Altivec is somewhat ...
>IBM's implementation of Altivec might not be better than Motorola's, but that
>is a different conversation, eh? ;-)

No, it's exactly the same conversation. As I understand it, only IBM knows the
specifics of how Altivec is implemented in the G5. It's quite possible that the
issue width, issue restrictions, forwarding latencies, and whatnot result in an
Altivec implementation that performs no better (or possibly worse) than
Motorola's, in which case it wouldn't really matter whether the app was
"Velocity Engine" optimized or not when comparing it to a PC. (The second
sketch at the bottom shows how similar the two look at the source level.)

>However, the performance benefit from SSE2 coupled with an extremely high
>clock makes the P4 one mean machine. Furthermore, ICC does, in some cases,
>optimize using SSE2, which is certainly a great help for the programmer and a
>complete nightmare for a PR person trying to compare CPU performance without
>being able to use the special features of his own CPU. I would imagine the
>faces people will make when/if Apple manages to tweak gcc to do
>auto-vectorization on said SPEC tests. That is a very big _IF_ and a very long
>_when_. :)

IMO, the autovectorization that ICC does doesn't have much of an impact. The
big impact is avoiding x87 altogether and using SSE2 for scalars. (ICC can
output code that's purely SSE2 but not vectorized--no x87 whatsoever. The third
sketch at the bottom shows the same idea with gcc.) I remember when Intel
switched to SSE2 for its SPEC submissions and it wasn't that big a deal, maybe
10-15%, so it's not putting Apple at any huge disadvantage.

-Tom
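
First sketch, on "priming": a minimal C program, assuming a POSIX system and a
hypothetical file name, that times the same large file read twice in a row. The
second read is normally served from the OS disk cache and can be dramatically
faster, which is exactly the kind of gap a rehearsal run right before a demo
would hide.

/* Time two consecutive reads of a large file. The first read (if the file
 * isn't already cached) goes to disk; the second is usually satisfied from
 * the OS disk cache. The file name "shark.psd" is hypothetical. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

static double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

static double time_read(const char *path)
{
    char buf[1 << 16];
    double start = now_seconds();
    FILE *f = fopen(path, "rb");
    if (!f) { perror(path); exit(1); }
    while (fread(buf, 1, sizeof buf, f) > 0)
        ;                                    /* drain the whole file */
    fclose(f);
    return now_seconds() - start;
}

int main(void)
{
    const char *path = "shark.psd";          /* hypothetical demo file */
    printf("cold read: %.2f s\n", time_read(path));
    printf("warm read: %.2f s\n", time_read(path));
    return 0;
}

On a 2003-era disk the cold read of a few-hundred-megabyte file takes seconds,
while the warm read runs at close to memory speed. Running the demo app once
before the audience sees it has the same effect.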
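
Second sketch, on Altivec vs. SSE: at the source level the two look almost
interchangeable. Here is the same 4-wide single-precision multiply-add written
both ways (the function name is made up; strictly speaking, packed
single-precision is SSE1, with SSE2 adding doubles). Which version runs faster
is decided entirely by the hardware implementation--issue width, latencies, and
so on--not by which intrinsics the app uses.

/* The same 4-wide single-precision a*b + c, in Altivec and in SSE.
 * Compile the first branch with a PowerPC compiler and -maltivec,
 * the second with any SSE-capable x86 compiler. */
#ifdef __ALTIVEC__
#include <altivec.h>

vector float madd4(vector float a, vector float b, vector float c)
{
    return vec_madd(a, b, c);                /* fused multiply-add */
}
#else
#include <xmmintrin.h>

__m128 madd4(__m128 a, __m128 b, __m128 c)
{
    return _mm_add_ps(_mm_mul_ps(a, b), c);  /* multiply, then add */
}
#endif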
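
Third sketch, on scalar SSE2 vs. x87: a plain scalar loop with nothing
vectorized in it. With gcc, the -msse2 -mfpmath=sse switches move scalar
floating point from the x87 stack to SSE2 registers without vectorizing
anything; ICC has its own switches for the same thing.

/* A plain scalar double-precision dot product.
 *
 *   x87 code:          gcc -O2 -S dot.c
 *   scalar SSE2 code:  gcc -O2 -msse2 -mfpmath=sse -S dot.c
 *
 * The second version uses mulsd/addsd instead of fmul/faddp--same scalar
 * work, different register file, no vectors anywhere. */
#include <stddef.h>

double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        sum += x[i] * y[i];                  /* scalar multiply-accumulate */
    return sum;
}

Diffing the two assembly listings makes the point: the difference comes from
scalar code generation, not from vectorizing anything.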