Computer Chess Club Archives

Subject: Re: The need to unmake move

Author: Robert Hyatt

Date: 09:09:12 09/03/03

On September 03, 2003 at 02:53:36, Tony Werten wrote:

>On September 02, 2003 at 11:09:21, Robert Hyatt wrote:
>
>>On September 01, 2003 at 23:58:38, Jeremiah Penery wrote:
>>
>>>On September 01, 2003 at 23:53:20, Robert Hyatt wrote:
>>>
>>>>It is almost guaranteed that _all_ critical search data for _all_ threads will
>>>>be allocated in a single processor's local memory.
>>>
>>>That would be the worst possible usage of memory.  Why in the world would a
>>>program perform like that?
>>
>>
>>Do you understand how parallel programming works?  Suppose you want to
>>do this:
>>
>>TREE  blocks[128];
>>
>>Where TREE is a big structure.
>>
>>That puts the blocks into consecutive memory addresses.
>>
>>On a NUMA machine that puts the blocks into one processor's local memory,
>>or it might split across two if you are near the end of one's memory.
>>
>>On a true SMP (non-NUMA) box, that works _perfectly_ and it is the way things
>>are done.  On a NUMA box, it sucks.
>
>I do not know very much about this stuff, but I don't see the problem.
>
>Just malloc a local copy of TREE and copy the global TREE into it. Of course this
>isn't optimal, but it should work very easily.
>
>Tony

It is more complicated than that due to the recursion...

But the basic idea is correct.  In the Compaq port, I simply allocated split
blocks locally for each processor...
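
A minimal sketch of that idea in C with pthreads, not the actual Crafty or Compaq code. The TREE fields, MAX_THREADS, global_tree, local_tree and the function names are all made up for illustration. It assumes the OS places a page on the node of the thread that first writes to it (first-touch, which is the usual Linux default), and it does not pin threads to processors, which a real port would likely need to do. It also ignores the recursion issue mentioned above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_THREADS 8               /* hypothetical limit for this sketch */

/* Stand-in for the engine's big search structure (fields are invented). */
typedef struct tree {
    int  thread_id;
    int  ply;
    char board[4096];
} TREE;

static TREE  global_tree;               /* shared state to be copied */
static TREE *local_tree[MAX_THREADS];   /* only the pointers sit side by side */

/* Runs on the worker thread: allocate and fill the block from the thread
   that will use it, so the first write to its pages happens there. */
static void *worker(void *arg)
{
    int id = (int)(intptr_t)arg;
    TREE *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    memcpy(t, &global_tree, sizeof *t); /* first touch is on this thread */
    t->thread_id = id;
    local_tree[id] = t;
    return t;
}

/* Start nthreads workers; return how many got a local block. */
int allocate_local_trees(int nthreads)
{
    pthread_t tid[MAX_THREADS];
    int created = 0, ok = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    for (int i = 0; i < nthreads; i++) {
        local_tree[i] = NULL;
        if (pthread_create(&tid[i], NULL, worker, (void *)(intptr_t)i) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++) {
        void *res = NULL;
        pthread_join(tid[i], &res);
        if (res != NULL)
            ok++;
    }
    return ok;
}

/* Release the per-thread blocks. */
void free_local_trees(int nthreads)
{
    for (int i = 0; i < nthreads; i++) {
        free(local_tree[i]);
        local_tree[i] = NULL;
    }
}
```

The point of the layout is that each block comes from its own malloc call made by the thread that owns it, instead of being one slot in a single static array.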


>
>>
>>As I said, it takes a _redesign_ of how memory is used, to make a NUMA
>>box run efficiently.  Assumptions that are fine on any SMP box fail on a
>>NUMA box.  I.e., Crafty runs just fine on a 32-CPU T90 from Cray.  But the T90 uses
>>a crossbar memory switch, not NUMA.  Ditto for my dual/quad boxes here.
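
For reference, the "consecutive memory addresses" point about TREE blocks[128] can be checked directly. This is only a sketch: the TREE fields and the helper names block_stride and array_span are invented, and it shows only that the array is one contiguous object. Which node's physical memory backs those addresses is up to the OS and hardware and cannot be seen from portable C.

```c
#include <stddef.h>
#include <stdint.h>

#define NBLOCKS 128

/* Stand-in for the engine's big search structure (fields are invented). */
typedef struct tree {
    int  thread_id;
    int  ply;
    char board[4096];
} TREE;

static TREE blocks[NBLOCKS];    /* the declaration from the quoted post */

/* Byte distance from the start of block i to the start of block i+1.
   Valid for 0 <= i < NBLOCKS (one-past-the-end address is allowed). */
size_t block_stride(int i)
{
    return (size_t)((uintptr_t)&blocks[i + 1] - (uintptr_t)&blocks[i]);
}

/* Bytes covered by the whole array, from its first block to its end. */
size_t array_span(void)
{
    return (size_t)((uintptr_t)&blocks[NBLOCKS] - (uintptr_t)&blocks[0]);
}
```

Every stride equals sizeof(TREE) and the span equals NBLOCKS * sizeof(TREE), so there is no gap where a per-processor placement could be slipped in without changing how the blocks are allocated.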

Last modified: Thu, 15 Apr 21 08:11:13 -0700