Author: Robert Hyatt
Date: 19:54:49 09/02/03
On September 02, 2003 at 18:18:15, Jeremiah Penery wrote:

>On September 02, 2003 at 11:09:21, Robert Hyatt wrote:
>
>>On September 01, 2003 at 23:58:38, Jeremiah Penery wrote:
>>
>>>On September 01, 2003 at 23:53:20, Robert Hyatt wrote:
>>>
>>>>It is almost guaranteed that _all_ critical search data for _all_ threads will
>>>>be allocated in a single processor's local memory.
>>>
>>>That would be the worst possible usage of memory. Why in the world would a
>>>program perform like that?
>>
>>
>>Do you understand how parallel programming works? Suppose you want to
>>do this:
>>
>>TREE blocks[128];
>>
>>Where TREE is a big structure.
>>
>>That puts the blocks into consecutive memory addresses.
>
>That's one part of the critical data structures, but there are more parts that
>can get placed elsewhere.

Maybe. But I use threads. And on NUMA threads are _bad_. One example, do you
_really_ want to _share_ all the attack bitmap stuff? That means it is in one
processor's local memory, but will be slow for all others. What about the
instructions? Same thing.

NUMA requires a bit of thinking about the local vs remote memory issue, and the
name of the game is replication, with respect to often-used data. On a SMP box,
replication wastes memory. On NUMA, it speeds things up significantly. That was
what I meant by some "rethinking" of current things.

The split blocks _must_ be shared. That is an intrinsic part of my design. That
is not hard to do. But split blocks need to exist on _all_ machines and when I
give a chunk of work to a processor, all the data needs to be put in a local
split block, period. That is doable. Without a lot of headaches...

>
>>On a NUMA machine that puts the blocks into one processor's local memory,
>>or it might split across two if you are near the end of one's memory.
>>
>>On a true SMP (non-NUMA) box, that works _perfectly_ and it is the way things
>>are done. On a NUMA box, it sucks.
>
>Then on an SMP machine it gets put into one memory bank, and interleaving can't
>be used, and you're still screwed.

Not so. Interleaving puts consecutive words of memory into consecutive banks,
which is the reason it is called "interleaving" in the first place. :)

Now I can read consecutive words in parallel from different banks...
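
To make the interleaving point concrete, here is a minimal sketch in C of how a
word address could map to a bank with and without interleaving. The bank count,
the bank size, and all function names (bank_interleaved, bank_contiguous,
banks_touched) are made up for illustration; real memory controllers differ in
the details, so treat this as a toy model and not a description of any
particular machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameters, chosen only for illustration. */
enum { NBANKS = 4, WORDS_PER_BANK = 1024 };

/* Interleaved layout: consecutive words land in consecutive banks. */
static unsigned bank_interleaved(size_t word_addr)
{
    return (unsigned)(word_addr % NBANKS);
}

/* Non-interleaved layout: each bank holds one contiguous range of words. */
static unsigned bank_contiguous(size_t word_addr)
{
    assert(word_addr < (size_t)NBANKS * WORDS_PER_BANK);
    return (unsigned)(word_addr / WORDS_PER_BANK);
}

/* Count how many distinct banks a run of `count` consecutive words touches. */
static unsigned banks_touched(size_t start, size_t count,
                              unsigned (*bank_of)(size_t))
{
    unsigned seen[NBANKS] = {0};
    unsigned distinct = 0;

    for (size_t i = 0; i < count; i++) {
        unsigned b = bank_of(start + i);
        if (!seen[b]) {
            seen[b] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

In this toy model, banks_touched(0, 4, bank_interleaved) touches all four
banks, so the four reads could in principle overlap, while
banks_touched(0, 4, bank_contiguous) touches only one bank.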
Last modified: Thu, 15 Apr 21 08:11:13 -0700