Computer Chess Club Archives



Subject: Re: Question for Eugene

Author: Robert Hyatt

Date: 03:15:15 08/18/05



On August 16, 2005 at 19:00:48, Eugene Nalimov wrote:

>On August 15, 2005 at 22:19:36, Robert Hyatt wrote:
>
>>In NUMA linux, when I malloc() or shmget() or whatever any kind of memory, it
>>isn't actually allocated on a specific node until the page is faulted in on a
>>reference.  This lets me shmget() the TREE data for each process before I fork()
>>the processes, then each process initializes its own TREE blocks, which faults
>>them into the physical memory on the node where that particular process is
>>running.
>>
>>Does Windows behave the same way, or is the mallocInterleaved() approach
>>currently used in Crafty the best approach?  I'm going to have to do a little
>>tweaking to make the current program approach behave on Windows, and if Windows
>>allocates physical memory like Linux, the same approach works on both.  If
>>not, oh well...
>
>Look at the code I wrote. There are 2 functions:
>
>void *WinMalloc(size_t cbBytes, int iThread)
>void *WinMallocInterleaved(size_t cbBytes, int cThreads)
>
>Basically what is done in the first one is:
>* remember current CPU affinity mask
>* force current thread to be executed on CPU#iThread
>* allocate memory
>* fill it with zeroes, so it will be committed
>* restore CPU affinity mask
>
>The second function is very similar:
>* remember current CPU affinity mask
>* loop for CPU 0..N
>  * force current thread to be executed on that CPU
>  * allocate some memory
>  * fill it with zeroes, so it will be committed
>* restore CPU affinity mask
>
>Thanks,
>Eugene
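
The steps Eugene lists for the first function can be sketched roughly as below. This is only an illustration, not the actual Crafty source: the names alloc_on_cpu() and free_on_cpu() are made up, the choice of VirtualAlloc() is an assumption, and the non-Windows branch does no pinning at all (it is there only so the sketch builds elsewhere).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper (not the Crafty source): pin the calling thread to one
   CPU, allocate, zero the block so its pages are committed, then restore the
   old affinity mask.  Pinning is best effort. */
void *alloc_on_cpu(size_t bytes, int cpu)
{
    void *p;
    /* The affinity mask has one bit per CPU, so cpu must fit in the mask. */
    assert(cpu >= 0 && cpu < (int)(8 * sizeof(void *)));
#ifdef _WIN32
    HANDLE self = GetCurrentThread();
    /* SetThreadAffinityMask returns the previous mask, or 0 on failure. */
    DWORD_PTR old = SetThreadAffinityMask(self, (DWORD_PTR)1 << cpu);
    p = VirtualAlloc(NULL, bytes, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (p != NULL)
        memset(p, 0, bytes);        /* touch every page while pinned */
    if (old != 0)
        SetThreadAffinityMask(self, old);
#else
    /* Fallback for other systems: no pinning, so the pages are first
       touched on whatever CPU the caller happens to be running on. */
    p = malloc(bytes);
    if (p != NULL)
        memset(p, 0, bytes);
#endif
    return p;
}

/* Release a block obtained from alloc_on_cpu(). */
void free_on_cpu(void *p)
{
#ifdef _WIN32
    VirtualFree(p, 0, MEM_RELEASE);
#else
    free(p);
#endif
}
```

The interleaved variant would presumably wrap the same pin/allocate/zero sequence in a loop over CPUs 0..N and restore the mask once at the end.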


I understood that part.  What wasn't clear was this:

Suppose I malloc() everything up front, but do not touch it.  Then as threads
are spawned, they zero their own "split blocks", which on Linux causes those
pages to be "faulted in" to the resident set, and the physical RAM is allocated
on the local node where they are first accessed.  It sort of looks like Windows
does the same thing, based on your "allocate and touch" approach.
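
For reference, the Linux side of this can be sketched as follows. It is a minimal demo under stated assumptions, not Crafty's code: first_touch_demo() and its parameters are hypothetical, and each child fills its block with a tag byte instead of zeroes purely so the parent can check the result. Whether a page actually lands on the toucher's node depends on the kernel's default (local) NUMA policy, which the code cannot verify by itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: create one shared segment before fork(), then let each
   child be the first to touch its own block.  Returns 0 on success. */
int first_touch_demo(int nprocs, size_t block_bytes)
{
    assert(nprocs > 0 && nprocs < 255 && block_bytes > 0);
    size_t total = (size_t)nprocs * block_bytes;

    /* Nothing is touched here, so no physical pages are assigned yet. */
    int id = shmget(IPC_PRIVATE, total, IPC_CREAT | 0600);
    if (id < 0)
        return -1;
    unsigned char *base = shmat(id, NULL, 0);
    /* Mark for removal now; the segment lives until the last detach. */
    shmctl(id, IPC_RMID, NULL);
    if (base == (void *)-1)
        return -1;

    int started = 0;
    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            /* Child: the first write faults the pages of block i in. */
            memset(base + (size_t)i * block_bytes, i + 1, block_bytes);
            _exit(0);
        }
        started++;
    }

    int ok = (started == nprocs);
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }
    /* Parent: confirm every child initialized its own block. */
    for (int i = 0; ok && i < nprocs; i++) {
        unsigned char *blk = base + (size_t)i * block_bytes;
        if (blk[0] != i + 1 || blk[block_bytes - 1] != i + 1)
            ok = 0;
    }
    shmdt(base);
    return ok ? 0 : -1;
}
```

Calling first_touch_demo(4, 4096) forks four children that each initialize one 4 KB block of the shared segment.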

Linux gives me a couple of approaches.  One as above is the simplest.  I can
also specify that memory be allocated on a specific node, but I am not sure that
is totally compatible with the shmget()/shmat() approach I am using to avoid
POSIX threads.

What we have certainly works, but if Windows behaves like Linux, so that I can
malloc() up front and then touch as the threads get initialized, the overall code
will be a bit simpler, since both will then be doing the same thing...

Hence my question... :)




Last modified: Thu, 15 Apr 21 08:11:13 -0700
