Computer Chess Club Archives


Subject: Re: The need to unmake move

Author: Robert Hyatt

Date: 08:13:12 09/02/03


On September 02, 2003 at 07:17:58, Gian-Carlo Pascutto wrote:

>On September 01, 2003 at 23:58:38, Jeremiah Penery wrote:
>
>>On September 01, 2003 at 23:53:20, Robert Hyatt wrote:
>>
>>>It is almost guaranteed that _all_ critical search data for _all_ threads will
>>>be allocated in a single processor's local memory.
>>
>>That would be the worst possible usage of memory.  Why in the world would a
>>program perform like that?
>
>Memory is divided in equal parts for NUMA-Opteron AFAIK, with
>each CPU owning one chunk.
>
>Crafty just allocates one contiguous big chunk for search structures,
>and hence it's in one processor's RAM.
>
>Messy thing about NUMA is the large hardware dependence of the code
>you end up writing.

It is certainly messy.

>
>I'm curious about how to ensure that a chunk of memory you allocate
>is on your local CPU. Just splitting up the splitblock list in per CPU
>pieces, so each CPU has a part in local memory would already remove half of the
>latencies I guess. At the very least the CPU that created the splitblock has
>local access, whereas normally you risk everything going over remote access.

How this is done varies from machine to machine.  On the Compaq compiler I
was testing on, you used a different form of malloc() that says "I want local
memory to _this_ processor, not memory anywhere that is convenient."
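
A minimal sketch of the per-CPU split block idea discussed above, assuming an operating system with a "first touch" page placement policy (the Linux default), where a page ends up in the memory of the CPU that first writes to it. Each search thread would call pool_init() itself, after being pinned to its processor, so that its own pool lands in local memory. All names here (split_block, block_pool, pool_init and so on) are hypothetical and are not taken from Crafty or any other engine. Where a NUMA-aware allocator is available, such as numa_alloc_local() in Linux's libnuma, it could replace the malloc()/memset() pair.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a search split block; the fields are invented. */
typedef struct {
    int in_use;          /* 1 while a search is using this block */
    int owner;           /* id of the thread/CPU whose pool this belongs to */
    char payload[4096];  /* placeholder for board state, move lists, etc. */
} split_block;

/* One pool per search thread instead of one big shared array. */
typedef struct {
    split_block *blocks;
    int count;
    int owner;
} block_pool;

/*
 * Must be called BY the thread that will own the pool (assumption: the
 * thread is already pinned to its CPU). Returns 0 on success, -1 on failure.
 */
int pool_init(block_pool *pool, int owner, int count)
{
    size_t bytes = (size_t)count * sizeof *pool->blocks;

    pool->blocks = malloc(bytes);
    if (pool->blocks == NULL)
        return -1;

    /* Write every page now, from the owning thread. Under a first-touch
       policy this is what places fresh pages in the caller's local memory. */
    memset(pool->blocks, 0, bytes);
    for (int i = 0; i < count; i++)
        pool->blocks[i].owner = owner;

    pool->count = count;
    pool->owner = owner;
    return 0;
}

/* Not thread-safe: intended to be called only by the owning thread. */
split_block *pool_acquire(block_pool *pool)
{
    for (int i = 0; i < pool->count; i++) {
        if (!pool->blocks[i].in_use) {
            pool->blocks[i].in_use = 1;
            return &pool->blocks[i];
        }
    }
    return NULL;  /* pool exhausted */
}

void pool_release(split_block *block)
{
    block->in_use = 0;
}

void pool_destroy(block_pool *pool)
{
    free(pool->blocks);
    pool->blocks = NULL;
    pool->count = 0;
}
```

With this layout the thread that creates a split point always takes the block from its own pool, so at least the creator gets local access; helper threads joining that split point still pay for remote access. Memory recycled by malloc() may already have been placed elsewhere, so the placement is a likelihood under the stated assumption, not a guarantee.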

>
>I didn't notice any problems (on the contrary!) when running on a 4-way NUMA
>Opteron box with my thing, but I'm much less dependent on shared data between
>threads, so even all-remote access isn't killing.

That is one potential benefit of avoiding lightweight threads.  Of course, the
less you share, the more other issues rear up...


>
>--
>GCP


