Author: Robert Hyatt
Date: 03:15:15 08/18/05
On August 16, 2005 at 19:00:48, Eugene Nalimov wrote:

>On August 15, 2005 at 22:19:36, Robert Hyatt wrote:
>
>>In NUMA linux, when I malloc() or shmget() or whatever any kind of memory, it
>>isn't actually allocated on a specific node until the page is faulted in on a
>>reference. This lets me shmget() the TREE data for each process before I fork()
>>the processes, then each process initializes its own TREE blocks, which faults
>>them into the physical memory on the node where that particular process is
>>running.
>>
>>Does windows behave the same way, or is the mallocInterleaved() approach
>>currently used in Crafty the best approach? I'm going to have to do a little
>>tweaking to make the current program approach behave on windows, and if windows
>>allocates physical memory like linux, it makes the approach work on both, if
>>not, oh well...
>
>Look at the code I wrote. There are 2 functions:
>
>void *WinMalloc(size_t cbBytes, int iThread)
>void *WinMallocInterleaved(size_t cbBytes, int cThreads)
>
>Basically what is done in the first one is:
>* remember current CPU affinity mask
>* force current thread to be executed on CPU#iThread
>* allocate memory
>* fill it with zeroes, so it will be committed
>* restore CPU affinity mask
>
>The second function is very similar:
>* remember current CPU affinity mask
>* loop for CPU 0..N
>  * force current thread to be executed on that CPU
>  * allocate some memory
>  * fill it with zeroes, so it will be committed
>* restore CPU affinity mask
>
>Thanks,
>Eugene

I understood that part. What wasn't clear was this:

Suppose I malloc() everything up front, but do not touch it. Then, as threads are spawned, they zero their own "split blocks", which on Linux causes those pages to be "faulted in" to the resident set, and the physical RAM is allocated on the local node where they are first accessed. It sort of looks like Windows does the same thing, based on your "allocate and touch" approach.

Linux gives me a couple of approaches. The one above is the simplest. I can also specify that memory be allocated on a specific node, but I am not sure that is totally compatible with the shmget()/shmat() approach I am using to avoid POSIX threads.

What we have certainly works, but if Windows behaves like Linux, so that I can malloc() up front and then touch as the threads get initialized, overall the code will be a bit simpler, since then both will be doing the same thing... Hence my question... :)
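
For anyone following along, here is a minimal sketch of the "allocate up front, touch later" pattern described above, on the Linux side only. It is not Crafty's code: the function name first_touch_demo and the BLOCK_BYTES size are made up for illustration, and the children are not pinned to any CPU or node, so this shows the order of operations (shmget() and shmat() before fork(), then each child zeroing only its own block) rather than proving where the pages end up. Actual placement depends on the kernel's memory policy and on where each process happens to be running when it first touches its pages.

```c
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BLOCK_BYTES (1 << 20)   /* hypothetical per-process block size */

/*
 * Reserve one shared segment for all processes before fork(), then let
 * each child zero ("first touch") only its own block.
 * Returns the number of blocks that were initialized by their owning
 * child, or -1 if the segment could not be created or attached.
 */
int first_touch_demo(int nprocs)
{
    assert(nprocs > 0);
    size_t total = (size_t)nprocs * BLOCK_BYTES;

    /* Reserve the segment; nothing here writes to its pages yet. */
    int shmid = shmget(IPC_PRIVATE, total, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    char *base = shmat(shmid, NULL, 0);
    if (base == (char *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    /* Mark the segment for removal once every process has detached. */
    shmctl(shmid, IPC_RMID, NULL);

    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                       /* stop forking, reap what we have */
        if (pid == 0) {
            /* Child: the attachment is inherited across fork(). */
            char *mine = base + (size_t)i * BLOCK_BYTES;
            memset(mine, 0, BLOCK_BYTES); /* first touch faults pages in */
            mine[0] = (char)(i + 1);      /* marker so the parent can check */
            _exit(0);
        }
    }

    /* Parent: wait for all children to finish. */
    while (wait(NULL) > 0)
        ;

    /* Count the blocks whose owning child actually wrote its marker. */
    int touched = 0;
    for (int i = 0; i < nprocs; i++)
        if (base[(size_t)i * BLOCK_BYTES] == (char)(i + 1))
            touched++;

    shmdt(base);
    return touched;
}
```

Calling first_touch_demo(4) from main() forks four children and reports how many of them initialized their block. To make the placement meaningful on a real NUMA box, each child would also need to be bound to a CPU or node before the memset(), which this sketch leaves out on purpose.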
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.