Computer Chess Club Archives


Subject: Re: The need to unmake move

Author: Eugene Nalimov

Date: 15:40:51 08/29/03


On August 29, 2003 at 18:32:46, Jeremiah Penery wrote:

>On August 29, 2003 at 08:46:12, Robert Hyatt wrote:
>
>>On August 28, 2003 at 19:12:52, Jeremiah Penery wrote:
>>
>>>But it's not more latency than you get *best case* when using a traditional SMP
>>>setup.  So you can only gain, even with a "poor algorithm".
>>
>>If you compare an SMP xeon to a dual 486 you _also_ "win".
>
>And what is that supposed to demonstrate?
>
>>But my point was that with a NUMA architecture, you might win a lot less
>>than you could, if the algorithm doesn't take into account the specific
>>architectural issues with a NUMA machine.
>>
>>My point was, again, that you want most references from a CPU to go to its
>>local memory for max performance.  It's an issue on _all_ NUMA-type machines.
>
>Of course I know that.  My point is that with Opteron, even if you are accessing
>non-local memory *always*, you are not accessing it slower than you would with,
>say, a traditional SMP machine (2x Xeon, for instance).
>Of course you can do a lot better - all I'm saying is that there's no way you're
>going to be doing worse.
>
>Either way you win, even with a crappy NUMA algorithm.

I am not so sure. With some NUMA implementations each memory bank has limited
bandwidth, so if you happen to allocate all the critical data in one node's
memory you'll overload its memory controller.
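
To put rough numbers on that, here is a toy model of the bottleneck. It is only a sketch: the function name, the bandwidth figure and the per-CPU demand are made-up units chosen for illustration, not measurements of any real machine. It assumes every CPU stalls in proportion to the most overloaded memory controller.

```python
def achieved_throughput(total_demand, node_bandwidth, placement):
    """Toy model: memory throughput actually delivered to all CPUs.

    total_demand   -- traffic all CPUs together would like to issue
    node_bandwidth -- what one node's memory controller can serve
    placement      -- fraction of the traffic that targets each node
    """
    scale = 1.0
    for fraction in placement:
        requested = total_demand * fraction
        if requested > node_bandwidth:
            # This controller is overloaded, so everything waiting on it slows down.
            scale = min(scale, node_bandwidth / requested)
    return total_demand * scale


# Hypothetical figures (arbitrary units), not taken from real hardware.
NODES = 8
CPUS = 32
DEMAND_PER_CPU = 1.0
NODE_BANDWIDTH = 4.0

total = CPUS * DEMAND_PER_CPU
one_node = [1.0] + [0.0] * (NODES - 1)   # all critical data on node 0
interleaved = [1.0 / NODES] * NODES      # data spread evenly over the nodes

hot = achieved_throughput(total, NODE_BANDWIDTH, one_node)
spread = achieved_throughput(total, NODE_BANDWIDTH, interleaved)
print("all data on one node:", hot)
print("data spread evenly:  ", spread)
```

With these invented numbers the single-node placement delivers one eighth of what the spread-out placement does, because 32 CPUs all queue on one controller. The model ignores latency and cache-coherency traffic, so it only shows the bandwidth ceiling.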

I have seen a case where an SMP application was blindly ported to a 32-CPU NUMA
system (8 nodes, 4 64-bit CPUs per node, 256 GB RAM total). The application ran
much slower on 32 CPUs than on a single CPU.

Thanks,
Eugene



Last modified: Thu, 15 Apr 21 08:11:13 -0700
