Author: Jeremiah Penery
Date: 16:56:34 08/29/03
On August 29, 2003 at 18:40:51, Eugene Nalimov wrote:

>On August 29, 2003 at 18:32:46, Jeremiah Penery wrote:
>
>>Of course I know that. My point is that with Opteron, even if you are accessing
>>non-local memory *always*, you are not accessing it slower than you would with,
>>say, a traditional SMP machine (2x Xeon, for instance).
>>Of course you can do a lot better - all I'm saying is that there's no way you're
>>going to be doing worse.
>>
>>Either way you win, even with a crappy NUMA algorithm.
>
>I am not so sure. With some NUMA implementations each memory bank has limited
>bandwidth, so if you happened to allocate all the critical data in one node's
>memory you'll overload its memory controller.
>I had seen a case where an SMP application was blindly ported to a 32-CPU NUMA
>system (8 nodes, 4 64-bit CPUs per node, 256GB RAM total). The application ran much
>slower on 32 CPUs than on a single CPU.

I'm not talking about "some NUMA implementations". I'm talking about 2-4 processor Opteron implementations. They should never have any of the problems you describe. Indeed, you can see from SPECrate that Opteron scales very nearly as well as Itanium, and that is with compilers and operating systems that are still not very NUMA-aware or well tuned for AMD64.
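For anyone following along, the bandwidth concern in the quoted text can be put into a toy model. This is only a sketch under made-up assumptions: the function name `delivered_bandwidth` is hypothetical, the bandwidth and demand figures are arbitrary units rather than measurements of any real machine, and only the node count (8) comes from the quoted example. It ignores latency and interconnect limits, and it says nothing about how a 2-4 processor Opteron box actually behaves.

```python
def delivered_bandwidth(demand_per_node, node_bw):
    """Toy model: each node's memory controller serves at most node_bw.

    demand_per_node[i] is the memory traffic aimed at node i's memory.
    Traffic above a controller's limit is simply not served in this model.
    """
    return sum(min(d, node_bw) for d in demand_per_node)


NODES = 8            # node count taken from the quoted 32-CPU example
NODE_BW = 1.0        # hypothetical per-node bandwidth, arbitrary units
TOTAL_DEMAND = 4.0   # hypothetical traffic from all CPUs, same units

# All critical data allocated in one node's memory.
hot_node = [TOTAL_DEMAND] + [0.0] * (NODES - 1)

# The same data spread evenly across every node's memory.
spread = [TOTAL_DEMAND / NODES] * NODES

print(delivered_bandwidth(hot_node, NODE_BW))  # 1.0
print(delivered_bandwidth(spread, NODE_BW))    # 4.0
```

In this model the single hot node caps the whole system at one controller's bandwidth, while spreading the data lets the full demand through. Whether that cap matters in practice depends on the per-node limit relative to demand, which is the point in dispute above.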
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.