Author: Eugene Nalimov
Date: 10:54:07 11/12/03
On November 12, 2003 at 11:55:20, Gian-Carlo Pascutto wrote:

>On November 11, 2003 at 23:42:45, Eugene Nalimov wrote:
>
>>My point is: it's possible that due to the fact that quad Opteron is NUMA --
>>not SMP -- system, for SMP-only program performance on quad Opteron can be
>>worse than on *real* quad SMP system, even when for one CPU Opteron
>>performance is much better. Itanium was used only as an example of such
>>system, I never recommended rewriting any program for it.
>
>I don't understand how. The NUMA part is RAM. Even worst case on the Opteron
>RAM is faster than Xeon SMP. So how could it ever be worse?
>
>--
>GCP

I can think of several reasons why scaling can be very bad when all the memory is allocated at one CPU:

(1) Memory *bandwidth*. All the memory requests go to exactly that CPU, so all CPUs have to use exactly one (or two) channels to memory. On Xeons the *worst case* memory bandwidth is higher.

(2) CPU-to-CPU *bandwidth* -- memory transfer speed is limited by the fact that *one* CPU has to process memory requests for *all* CPUs. Also notice that for the "normal" topology

0----1
|    |
|    |
2----3

CPU#3 has to go through either CPU#1 or CPU#2 to reach the memory of CPU#0.

(3) MOESI vs. MESI synchronisation protocols -- I was told that on MOESI (used by AMD) traffic due to shared *modified* cache lines is much higher than on MESI (used by Intel). If that is really so (I haven't investigated it myself), it can probably explain why on 32-bit Athlons Crafty prior to 19.5 scaled worse than on Pentium 4.

In any case, here are the results of Crafty 19.4 scaling on 2 different Opteron systems and on an Itanium2 system (measured before Crafty became NUMA-aware and before we decreased the amount of shared modifiable data):

Opteron system I:
2 CPUs: 1.57x
3 CPUs: 1.99x
4 CPUs: 1.98x

Opteron system II:
2 CPUs: 1.61x
3 CPUs: 2.13x
4 CPUs: 2.35x

Itanium2 system:
2 CPUs: 1.84x
3 CPUs: 2.63x
4 CPUs: 3.22x

Crafty 19.5 scales much better. On Opteron system II it reaches 3.8x on 4P.

Thanks,
Eugene
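
To make the scaling figures in this post easier to compare, here is a minimal sketch that converts each reported speedup into parallel efficiency (speedup divided by the number of CPUs). The numbers are the ones quoted above; the names `speedups`, `efficiency` and `efficiency_table` are made up for this illustration and do not come from Crafty or from the original measurements.

```python
# Speedups for Crafty 19.4 as reported in the post
# (measured before Crafty became NUMA-aware).
speedups = {
    "Opteron system I": {2: 1.57, 3: 1.99, 4: 1.98},
    "Opteron system II": {2: 1.61, 3: 2.13, 4: 2.35},
    "Itanium2 system": {2: 1.84, 3: 2.63, 4: 3.22},
}


def efficiency(speedup, cpus):
    """Parallel efficiency: 1.0 would mean perfectly linear scaling."""
    return speedup / cpus


def efficiency_table(data):
    """Map each system to {cpu_count: efficiency}, rounded to 3 places."""
    return {
        system: {n: round(efficiency(s, n), 3) for n, s in runs.items()}
        for system, runs in data.items()
    }


# Print one line per system, e.g. "<system>: 2P=..%  3P=..%  4P=..%".
for system, runs in efficiency_table(speedups).items():
    cells = "  ".join(f"{n}P={e:.0%}" for n, e in sorted(runs.items()))
    print(f"{system}: {cells}")
```

By the same measure, the 3.8x on 4P reported for Crafty 19.5 on Opteron system II corresponds to 3.8 / 4 = 0.95, i.e. 95% efficiency.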
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.