Computer Chess Club Archives


Subject: Re: Hash Table question

Author: Jay Urbanski

Date: 19:20:14 02/07/06


On February 07, 2006 at 11:03:14, Robert Hyatt wrote:

>On February 06, 2006 at 23:10:08, Jay Urbanski wrote:
>
>>On February 06, 2006 at 12:12:06, Robert Hyatt wrote:
>>
>>>On February 05, 2006 at 23:49:39, Jay Urbanski wrote:
>>>
>>>>On February 05, 2006 at 06:10:55, Vasik Rajlich wrote:
>>>>
>>>>>On February 04, 2006 at 12:19:33, dhanial wrote:
>>>>>
>>>>>>What is the best hash table settings for rybka for 10 to 15 min games and for 1
>>>>>>to 3 min games.
>>>>>
>>>>>For all time controls, as high as possible, provided that there is no swapping
>>>>>to disk.
>>>>>
>>>>>Vas
>>>>
>>>>This is not actually true on all systems.  For instance, if you have a
>>>>dual-processor Opteron it would behoove you to keep the hash size to under half
>>>>your physical memory at least, because Opteron systems are NUMA and latency to
>>>>memory attached to the non-local CPU is higher than latency to memory attached
>>>>to the CPU your process is running on.  It's about 60-80ns latency for local
>>>>memory and about 100-120ns for memory one hop away.
>>>
>>>Not all dual opterons are NUMA.  And not all NUMA-capable opterons really look
>>>like NUMA either.  You can configure the BIOS to interleave memory between nodes
>>>so that effectively all memory has the same latency (higher than the normal
>>>latency, since one of every two memory accesses would be local and the other
>>>remote).
>>
>>Yes I know you can - but if you can take advantage of NUMA it's advantageous to
>>do so.
>>
>>>When I play using an opteron, I run 'em in NUMA mode which is more efficient.
>>>
>>>One other point, just because you use 1/2 of memory does not guarantee you that
>>>all of that is "local memory".  If you run two engines on the same box, it is
>>>possible that their memory will be scattered all over the machine, rather than
>>>each using only local memory.  It is probably something that is not worth
>>>worrying about until you use a parallel search program where it is possible to
>>>control what goes where with a little effort.
>>
>>If your OS supports NUMA hints (Windows 2003, XP64, and some Linux kernels do)
>>then it knows what memory is attached to what CPU and will attempt to keep
>>memory allocated to a process local to the CPU it is executing on.
>
>Not very well without help.  For example, Linux uses the "fault-in/allocate"
>model, which means that if you have any memory that should be used mainly by a
>process on a specific CPU, that memory should not be touched first by any other
>CPU, or else it will fault in to the local memory of the node that touched it,
>not the node it is needed on.
>
>In Crafty, I handle this by starting processes/threads, and letting each
>malloc() and then initialize their own local memory so that it faults in on the
>right processor, assuming the O/S doesn't let the process migrate to another
>CPU.  On Linux I even use the processor affinity stuff to prevent a process from
>bouncing between processors.
>
>I do this for the local TREE blocks in Crafty, since I want that data to be
>"close".

Yes, I agree it's much better for the application to be aware of these issues and
to use processor affinity if possible.
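
The approach described above (each thread pins itself, then allocates and initializes its own memory) could be sketched roughly as follows. This is a minimal illustration for Linux with glibc, not Crafty's actual code: the names worker_t, worker_init and init_local_blocks are hypothetical, and it assumes the kernel's default first-touch page placement. For brevity the worker threads exit after initializing; in a real engine the same pinned thread would go on to search using its block.

```c
#define _GNU_SOURCE            /* for cpu_set_t macros and pthread_setaffinity_np */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread descriptor; field names are illustrative only. */
typedef struct {
    int    cpu;      /* CPU the worker should be pinned to */
    size_t bytes;    /* size of the worker's private block */
    void  *block;    /* filled in by the worker; NULL on failure */
    int    pinned;   /* 1 if the affinity request succeeded */
    int    started;  /* 1 if the thread was created */
} worker_t;

static void *worker_init(void *arg)
{
    worker_t *w = arg;
    cpu_set_t set;

    /* Pin first, so the pages touched below fault in on this CPU's node. */
    CPU_ZERO(&set);
    CPU_SET(w->cpu, &set);
    w->pinned = (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0);

    /* Allocate and write from the owning thread. Under first-touch
       placement, physical pages are assigned when first written,
       not when malloc() returns. */
    w->block = malloc(w->bytes);
    if (w->block != NULL)
        memset(w->block, 0, w->bytes);
    return NULL;
}

/* Starts n workers, waits for them, and returns how many got a block. */
int init_local_blocks(worker_t *workers, int n)
{
    assert(workers != NULL && n > 0);

    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    int ok = 0;
    if (tids == NULL)
        return 0;

    for (int i = 0; i < n; i++) {
        workers[i].block = NULL;
        workers[i].pinned = 0;
        workers[i].started =
            (pthread_create(&tids[i], NULL, worker_init, &workers[i]) == 0);
    }
    for (int i = 0; i < n; i++) {
        if (workers[i].started) {
            pthread_join(tids[i], NULL);
            if (workers[i].block != NULL)
                ok++;
        }
    }
    free(tids);
    return ok;
}
```

A caller would fill in cpu and bytes for each worker_t, call init_local_blocks(), and check the pinned flag afterwards, since the affinity request can fail (for example when the CPU is outside the process's allowed set). Whether the pages really end up node-local still depends on the kernel's memory policy and on free memory being available on that node.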



Last modified: Thu, 15 Apr 21 08:11:13 -0700
