Computer Chess Club Archives


Subject: Re: Dual AMD v Intel Was Re: Here is the comparison !

Author: Robert Hyatt

Date: 08:20:04 11/29/02


On November 29, 2002 at 01:34:38, Aaron Gordon wrote:

>On November 29, 2002 at 00:03:32, Robert Hyatt wrote:
>
>>On November 28, 2002 at 23:11:02, Aaron Gordon wrote:
>>
>>>On November 28, 2002 at 20:05:11, Robert Hyatt wrote:
>>>
>>>>On November 27, 2002 at 16:57:52, Wayne Lowrance wrote:
>>>>
>>>>>On November 26, 2002 at 11:06:33, Robert Hyatt wrote:
>>>>>
>>>>>>On November 26, 2002 at 08:28:26, Brian Richardson wrote:
>>>>>>
>>>>>>>On November 26, 2002 at 02:28:19, Jorge Pichard wrote:
>>>>>>>
>>>>>>>>On November 25, 2002 at 17:37:29, Brian Richardson wrote:
>>>>>>>>
>>>>>>>>>On November 25, 2002 at 16:29:39, Jorge Pichard wrote:
>>>>>>>>>
>>>>>>>>>>On November 25, 2002 at 16:00:56, Brian Richardson wrote:
>>>>>>>>>>
>>>>>>>>>>>On November 25, 2002 at 14:33:16, Jorge Pichard wrote:
>>>>>>>>>>>
>>>>>>>>>>>>On November 25, 2002 at 10:19:13, Ricardo Gibert wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>On November 25, 2002 at 02:45:35, Jorge Pichard wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>On November 24, 2002 at 23:10:44, Ricardo Gibert wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>On November 24, 2002 at 15:06:55, Jorge Pichard wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>On November 24, 2002 at 14:25:47, Joachim Rang wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>On November 24, 2002 at 14:19:06, Jorge Pichard wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>On November 24, 2002 at 13:15:09, Bob Durrett wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>On November 24, 2002 at 11:49:07, Jorge Pichard wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>http://www.specbench.org/osg/cpu2000/results/res2002q3/cpu2000-20020909-01635.html
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Didn't someone say RDRAM was bad for chess?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Bob D.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>But is still faster than any single processor available with any other memory.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>Pichard.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Athlon XP 2600+ is 17% faster:
>>>>>>>>>>>>>>>>>http://www.specbench.org/osg/cpu2000/results/res2002q3/cpu2000-20020812-01551.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Really, I must be blind.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Pichard.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>And faster still is the Athlon XP 2800+:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>http://www.spec.org/osg/cpu2000/results/res2002q4/cpu2000-20020923-01691.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>You are still missing the point here:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Did you check how many CPU(s)were enabled: = 1 for this test, I did NOT see
>>>>>>>>>>>>>>CPU(s) enabled: = 2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Pichard.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Both of you guys provide examples with 1 CPU enabled. When I do likewise, I'm
>>>>>>>>>>>>>somehow missing the point. Okey-dokey, I think I can live with that.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>Sorry, I meant to say all of you missed the point, but at the same time when
>>>>>>>>>>>>only 1 CPU is enabled, the Intel cannot compete with any Athlon XP 2600+ or
>>>>>>>>>>>>higher. Now if AMD releases a Dual 2400 MP, it will beat the @#$+ out of Intel's
>>>>>>>>>>>>higher Xeon.
>>>>>>>>>>>>
>>>>>>>>>>>>Pichard.
>>>>>>>>>>>
>>>>>>>>>>>Actually, just the opposite has been shown for 32bit AMD
>>>>>>>>>>>(e.g., slower than dual Xeon).
>>>>>>>>>>>
>>>>>>>>>>>Brian
>>>>>>>>>>
>>>>>>>>>>Probably for the Dual 2200+ but NOT for the upcoming Dual 2400+ MP
>>>>>>>>>>
>>>>>>>>>>http://www.anandtech.com/IT/showdoc.html?i=1747&p=10
>>>>>>>>>>
>>>>>>>>>>And
>>>>>>>>>>
>>>>>>>>>>http://www.anandtech.com/IT/showdoc.html?i=1747&p=12
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Pichard.
>>>>>>>>>
>>>>>>>>>Multiprocessor performance is highly application dependent.
>>>>>>>>>In this case, AMD does much worse with chess applications
>>>>>>>>>(regardless of clock speed), than Intel.
>>>>>>>>>
>>>>>>>>>Thus, while the benchmarks cited above are meaningful, they
>>>>>>>>>only apply to the workloads being tested, which have little to do with
>>>>>>>>>computer chess.
>>>>>>>>
>>>>>>>>Just wait until AMD releases the Dual 2600+ MP, then install Deep Fritz on one of
>>>>>>>>these babies and compare it against a Dual Xeon 2.8 GHz.
>>>>>>>>
>>>>>>>>Pichard.
>>>>>>>
>>>>>>>I will always bow to the data, but based on results so far, dual AMDs have
>>>>>>>a scalability efficiency of about only 1.4x (where 2x would be ideal).
>>>>>>>Dual Intels are at about 1.8-1.9x, at least for good SMP code (like Crafty's).
>>>>>>>This more than makes up for individual AMD CPUs being somewhat faster than
>>>>>>>Intel.  I would expect any 32bit AMD to be about the same, due to memory
>>>>>>>bottlenecks.  This is not an issue for more general workloads.
>>>>>>>The 32-64x Hammers should do much better.
>>>>>>>Brian
>>>>>>
>>>>>>
>>>>>>If you read the "underground analysis" it would seem that this is going to get
>>>>>>worse.  The new AMDs are supposed to have a built-in memory controller on the
>>>>>>processor chip.  Unfortunately, it is now known that it is an inferior
>>>>>>controller compared to late Intel offerings.  The question is, how is AMD going
>>>>>>to respond?  The answer is unknown, but if they don't, they will get their
>>>>>>clock cleaned (again).
>>>>>
>>>>>Again? The heck you say! Go to Tom's Hardware. Also, as far as I know, a P4 is
>>>>>much inferior to AMD's performance clock vs clock.
>>>>>Thanks
>>>>>Wayne
>>>>
>>>>
>>>>That has nothing to do with what I said.  AMD implemented a single-channel
>>>>memory controller.  Intel is now using dual-channel...  And their memory
>>>>bandwidth is steadily going up while AMD is stuck with the design they chose
>>>>for the moment...
>>>>
>>>>Whether their CPU is more efficient or not is one issue.  But clearly their
>>>>duals are significantly worse than Intel's duals...  more so than their single
>>>>cpu speed advantage can cover for...
>>>
>>>Nforce2 is a dual-channel DDR board for Tbird/AthlonXP's/AthlonMP's (single cpu).
>>>Also as I've said in the past (and proven, look for the messages I've posted if
>>>you don't remember for some reason) the dual AMD systems running with a 1.7x
>>>speedup can beat any of the P4's even with the P4's having a 1.9x speedup.
>>>
>>>Also, I don't doubt you're a talented programmer, teacher, etc. However, if
>>>you're going to test something please make sure to test it properly. I'm not
>>>trying to be 'mean' or anything like that. I am a perfectionist though and it
>>>really bugs me when I see something done improperly. Dual AMD systems with
>>>Crafty can get a 1.66-1.70x speedup, not the 1.4x you're always posting. If
>>>you're going to test something why not do it right? That's all I'm asking.
>>
>>_I_ didn't test it.  I don't own a single AMD machine, nor do any of my
>>labs at UAB have a single AMD processor.
>>
>>I rely on people like Eugene, and since he has no vested interest in either
>>chip, and since he is working on the Microsoft Visual C compiler project, I
>>assume that he knows what he is doing and that his results are correct.  I've
>>not seen anything to contradict the numbers he has posted (1.4X the raw NPS
>>using duals compared to single cpu...)
>>
>>I _did_ run tests on the Intel boxes, and reported those results as did
>>Eugene...
>>
>>
>>>
>>>If you're a teacher in a university you should be able to ask around and run one
>>>of the binaries I compiled on a dual AMD system one of the students owns.
>>
>>As I said, I have no AMD machines of my own.  I run labs with over 200
>>machines, no AMDs at all.  No students in my classes own AMD processors.
>>I don't believe I can say any more...
>>
>>
>>
>>
>> If you
>>>want to compile it yourself I can give you the compiler options I used and
>>>profiling methods. Now, Slate and I have already done that but if for some
>>>reason you need to see the numbers produced directly in front of you then what I
>>>stated above is a completely viable option.
>>
>>
>>As far as the best AMD beating the best Intel, I don't personally believe it
>>yet.  I have a dual 2.8GHz Xeon on the way.  Let's compare when it arrives,
>>with everything wide open including hyper-threading.  I don't believe the AMD
>>machines can keep up, personally...
>
>I find it very hard to believe not a single person in that entire university has
>an AMD cpu. Anyway, regarding the 1.7x speedup. I haven't tested it with
>MSVC6/7 so I can't say what sort of speedup it has. I do know for a fact (and
>it has been tested on numerous occasions) that you can get 1.7x with the Intel C
>compiler.
>


Don't extrapolate.  I didn't say "not a single person at UAB has an AMD
processor."  I said none of the machines in my department are AMD, and nobody
I know here has an AMD including students.  However, UAB has almost 20,000
employees so I won't begin to speak for them...

As far as the compiler goes, I can't comment, other than to say that in every
test I have seen, Microsoft's compiler is better.  But I don't see how the
compiler can affect the single-cpu vs dual-cpu speedup: the single-cpu number
is influenced by the compiler just as the dual-cpu number is, so the "speedup"
ratio stays comparable no matter which compiler is used.
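
The ratio argument above can be sketched in a few lines. This is only an illustration: the NPS figures and the 10% compiler gain are hypothetical, and it assumes a compiler scales single-cpu and dual-cpu NPS by the same factor, which is the premise of the argument rather than a measured fact.

```python
import math

def smp_speedup(single_nps, dual_nps):
    """Speedup is the ratio of dual-cpu NPS to single-cpu NPS."""
    return dual_nps / single_nps

# Hypothetical baseline numbers, not taken from any benchmark in this thread.
single_nps = 1_000_000
dual_nps = 1_700_000

# Assumption: a better compiler makes both runs faster by the same factor.
compiler_gain = 1.10

before = smp_speedup(single_nps, dual_nps)
after = smp_speedup(single_nps * compiler_gain, dual_nps * compiler_gain)

# The common factor cancels, so the speedup ratio is unchanged.
print(math.isclose(before, after))
```

If a compiler changed the two runs by different factors, the ratio would move; the sketch does not address that case.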

You might have a better chipset or motherboard.  I don't know.  I do know that
everyone who has tested dual AMDs (perhaps except yourself) has reported
disappointing results, which makes me believe there is a memory bottleneck.
The dual 2.8 I have coming has the E7501 server chipset, which I suspect will
perform very well with a threaded Crafty running two (and four using SMT)
threads vs one thread...

With SMT on, I expect something greater than a 2.0x improvement in raw NPS.  But
rather than speculate, I'll post some numbers after the box arrives...


>As far as hyperthreading goes.. I highly doubt it's going to have a massive
>positive impact on performance.


It seems to give about a 30% improvement under heavy load.  This includes benchmarks
of Crafty, large database query servers, large web servers, etc.  There are
benchmarks posted all over the internet...

30% is enough to push past AMD...


> This is what it would take to swing Intel into
>the lead. I've seen results putting applications with hyperthreading enabled 15%
>slower to 15% faster than without hyperthreading. Granted it may help crafty a
>little, however it's going to take a lot more than that to get any sort of P4
>to keep up with a dual MP2600/2800+ setup. From my projections in an earlier
>message even a 2600+ will be able to best a dual P4-3GHz setup. Here is the
>message I was talking about.. perhaps you'll remember.
>
>
>
>>I don't think it's fair for you to find the slowest possible binary for the AMD
>>and some IntelC5 binary and then claim that the speedup is slow. I don't think
>>it's fair either if someone takes a slow binary for a P4 and compares it to a
>>fast binary for an AMD cpu.
>
>>You seem to conveniently forget the benchmarks I've done and other people here
>>have done. Take a look at my latest graph of crafty results:
>>http://speedycpu.dyndns.org/crafty/craftybench4.jpg
>>Note: the P4 2.76GHz is an overclocked 1.8A Northwood at 153.5fsb (614MHz RDRAM).
>>
>>Now, the SMP binaries I have are able to produce a 1.7x speedup in the
>>benchmark. You claim the P4's get 1.8x, that's fine. Take the P4-2.76's result
>>(1,120,011 nps) and multiply it by 1.8. You get 2,016,019.8 nps. Not too shabby,
>>right? Well.. take the 1.86GHz XP and multiply its nps by 1.7 and you get
>>2,035,330.1. Still faster. Now, if you're saying, "Well yadda yadda is
>>overclocked and etc etc". Yeah, and even faster things will be released here
>>shortly. I can guarantee the P4-2.76 w/ 614MHz RDRAM would be as fast or a hair
>>faster than a standard P4-2.8. The AthlonXP at 1.86 would be more around a 2300+
>>if such a thing existed.
>>
>>Moving on to the future.. P4-3GHz will soon be released as well as the 2800+
>>(being announced on October 1st). Let's do some rough guessing. If a P4 gets
>>1,120,011 nps @ 2.76 it should get about 1,217,403 nps at 3GHz and that's
>>probably still having the RDRAM clocked to insanity. Take the 2.52GHz AthlonXP @
>>1,578,197. At 2133MHz (AthlonXP 2600+) it should do about 1,335,831 nps. Again
>>do 1,335,831 * 1.7 and 1,217,403 * 1.8 and you get:
>>2,270,912.7 nps for the dual XP 2600+ (2.13GHz)
>>2,191,325.4 nps for the dual P4-3GHz.
>
>>Since Crafty is pretty linear you know these numbers are very close to the
>>actual results. So far from what I've seen Pentium4's need an entire GHz more
>>and twice the L2 cache just to come close. This is what I call a $500 keychain.
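
The projection quoted above can be reproduced with a short sketch. It takes the NPS figures and speedup factors exactly as stated in the quoted message and assumes, as that message does, that NPS scales linearly with clock speed. The function and constant names are illustrative only, and the result is a paper estimate, not a measurement.

```python
# Single-cpu Crafty results and SMP speedups as quoted in the message.
P4_NPS_AT_2760_MHZ = 1_120_011   # overclocked P4 at 2.76GHz
XP_NPS_AT_2520_MHZ = 1_578_197   # AthlonXP at 2.52GHz
P4_SPEEDUP = 1.8                 # dual P4 speedup claimed in the thread
XP_SPEEDUP = 1.7                 # dual AMD speedup claimed in the thread

def scale_by_clock(nps, from_mhz, to_mhz):
    """Assumption: NPS is proportional to clock speed."""
    return round(nps * to_mhz / from_mhz)

def dual_nps(single_nps, speedup):
    """Projected dual-cpu NPS from a single-cpu result and an SMP speedup."""
    return single_nps * speedup

p4_3000 = scale_by_clock(P4_NPS_AT_2760_MHZ, 2760, 3000)
xp_2133 = scale_by_clock(XP_NPS_AT_2520_MHZ, 2520, 2133)

print(f"P4 3GHz single:    {p4_3000:,}")   # 1,217,403
print(f"XP 2600+ single:   {xp_2133:,}")   # 1,335,831
print(f"dual XP 2600+:     {dual_nps(xp_2133, XP_SPEEDUP):,.1f}")   # 2,270,912.7
print(f"dual P4-3GHz:      {dual_nps(p4_3000, P4_SPEEDUP):,.1f}")   # 2,191,325.4
```

Changing either speedup constant shows how sensitive the comparison is to the disputed 1.4x, 1.7x and 1.8x figures.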



Last modified: Thu, 15 Apr 21 08:11:13 -0700
