Author: Gian-Carlo Pascutto
Date: 04:17:58 09/02/03
On September 01, 2003 at 23:58:38, Jeremiah Penery wrote:

>On September 01, 2003 at 23:53:20, Robert Hyatt wrote:
>
>>It is almost guaranteed that _all_ critical search data for _all_ threads will
>>be allocated in a single processor's local memory.
>
>That would be the worst possible usage of memory. Why in the world would a
>program perform like that?

Memory is divided in equal parts for NUMA-Opteron AFAIK, with each CPU owning one chunk. Crafty just allocates one big contiguous chunk for its search structures, and hence it all ends up in one processor's RAM.

The messy thing about NUMA is the large hardware dependence of the code you end up writing. I'm curious about how to ensure that a chunk of memory you allocate is on your local CPU.

Just splitting up the splitblock list into per-CPU pieces, so each CPU has a part in local memory, would already remove half of the latencies, I guess. At the very least the CPU that created the splitblock has local access, whereas normally you risk everything going over remote access.

I didn't notice any problems (on the contrary!) when running on a 4-way NUMA Opteron box with my thing, but I'm much less dependent on shared data between threads, so even all-remote access isn't a killer.

--
GCP
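For what it's worth, the per-CPU splitblock idea above could be sketched roughly as follows. This is not Crafty's code or mine: `split_block`, `pool`, `init_local_pool`, `alloc_split_block`, `init_all_pools` and the two size constants are all made-up names for illustration. It assumes a Linux-style "first touch" policy, where a page is placed on the node of the CPU that first writes to it, and it assumes each thread is already pinned to its own CPU (the pinning is not shown). Note also that `malloc` may hand back pages that were touched earlier, so this is a best-effort sketch, not a guarantee of local placement.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_THREADS 4          /* hypothetical: one search thread per CPU */
#define BLOCKS_PER_THREAD 64   /* hypothetical pool size per thread */

/* Hypothetical stand-in for a search split block. */
typedef struct {
    int in_use;
    int owner;                 /* id of the thread whose pool this lives in */
    char payload[4096];        /* placeholder for the real search state */
} split_block;

/* One pool per thread instead of one big shared array. */
static split_block *pool[MAX_THREADS];

/* Runs on the owning thread: allocate the pool and write to every page,
 * so that under a first-touch policy the pages land on this thread's node. */
static void *init_local_pool(void *arg)
{
    int id = *(int *)arg;
    split_block *p = malloc(BLOCKS_PER_THREAD * sizeof *p);
    if (p == NULL)
        return NULL;
    memset(p, 0, BLOCKS_PER_THREAD * sizeof *p);   /* the "first touch" */
    for (int i = 0; i < BLOCKS_PER_THREAD; i++)
        p[i].owner = id;
    pool[id] = p;
    return NULL;
}

/* Take a free block from the caller's own pool, or NULL if it is exhausted.
 * Assumes only thread `id` allocates from pool[id], so no lock is taken. */
static split_block *alloc_split_block(int id)
{
    assert(id >= 0 && id < MAX_THREADS);
    for (int i = 0; i < BLOCKS_PER_THREAD; i++) {
        if (!pool[id][i].in_use) {
            pool[id][i].in_use = 1;
            return &pool[id][i];
        }
    }
    return NULL;
}

/* Start one initializer thread per pool and wait for all of them.
 * Returns 0 on success, -1 if a thread or an allocation failed. */
static int init_all_pools(void)
{
    pthread_t tid[MAX_THREADS];
    static int ids[MAX_THREADS];
    int started = 0, ok = 1;

    for (int i = 0; i < MAX_THREADS; i++) {
        ids[i] = i;
        if (pthread_create(&tid[i], NULL, init_local_pool, &ids[i]) != 0) {
            ok = 0;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; ok && i < MAX_THREADS; i++)
        if (pool[i] == NULL)
            ok = 0;
    return ok ? 0 : -1;
}
```

In a real engine the search threads themselves would call `init_local_pool` at startup rather than throwaway helper threads, since the point is that the thread that later uses the blocks is the one that touches them first. Where libnuma is available, `numa_alloc_local()` is the more explicit way to ask for memory on the calling thread's node.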
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.