Author: Jay Urbanski
Date: 19:20:14 02/07/06
On February 07, 2006 at 11:03:14, Robert Hyatt wrote:

>On February 06, 2006 at 23:10:08, Jay Urbanski wrote:
>
>>On February 06, 2006 at 12:12:06, Robert Hyatt wrote:
>>
>>>On February 05, 2006 at 23:49:39, Jay Urbanski wrote:
>>>
>>>>On February 05, 2006 at 06:10:55, Vasik Rajlich wrote:
>>>>
>>>>>On February 04, 2006 at 12:19:33, dhanial wrote:
>>>>>
>>>>>>What are the best hash table settings for Rybka for 10 to 15 min games and for 1
>>>>>>to 3 min games?
>>>>>
>>>>>For all time controls, as high as possible, provided that there is no swapping
>>>>>to disk.
>>>>>
>>>>>Vas
>>>>
>>>>This is not actually true on all systems. For instance, if you have a dual
>>>>processor Opteron it would behoove you to keep the hash size to under half your
>>>>physical memory at least, because Opteron systems are NUMA and latency to memory
>>>>attached to the non-local CPU is higher than latency to memory attached to the
>>>>CPU your process is running on. It's about 60-80ns latency for local memory and
>>>>about 100-120ns for 1-hop away.
>>>
>>>Not all dual opterons are NUMA. And not all NUMA-capable opterons really look
>>>like NUMA either. You can configure the BIOS to interleave memory between nodes
>>>so that effectively all memory has the same latency (higher than the normal
>>>latency since one of every two memory accesses would be local, the other remote).
>>
>>Yes I know you can - but if you can take advantage of NUMA it's advantageous to
>>do so.
>>
>>>When I play using an opteron, I run 'em in NUMA mode which is more efficient.
>>>
>>>One other point, just because you use 1/2 of memory does not guarantee you that
>>>all of that is "local memory". If you run two engines on the same box, it is
>>>possible that their memory will be scattered all over the machine, rather than
>>>each using only local memory. It is probably something that is not worth
>>>worrying about until you use a parallel search program where it is possible to
>>>control what goes where with a little effort.
>>
>>If your OS supports NUMA hints (Windows 2003, XP64, and some Linux kernels do)
>>then it knows what memory is attached to what CPU and will attempt to keep
>>memory allocated to a process local to the CPU it is executing on.
>
>Not very well without help. For example, linux uses the "fault-in/allocate"
>model. Which means that if you have any memory that should be used mainly by a
>process on a specific CPU, that memory should not be touched first by any other
>CPU, else it will fault in to the local memory for the node it is running on,
>not the node it is needed on.
>
>In Crafty, I handle this by starting processes/threads, and letting each
>malloc() and then initialize their own local memory so that it faults in on the
>right processor, assuming the O/S doesn't let the process migrate to another
>CPU. On Linux I even use the processor affinity stuff to prevent a process from
>bouncing between processors.
>
>I do this for the local TREE blocks in Crafty, since I want that data to be
>"close".

Yes, I agree it's much better for the application to be aware of these issues
and to use processor affinity if possible.
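
For anyone who wants to try the "pin first, then touch" idea described above, here is a rough sketch for Linux. It is not Crafty's code: the function name alloc_local_block and its parameters are made up for illustration. It assumes glibc's sched_setaffinity() and the CPU_ZERO/CPU_SET macros, and it assumes the kernel places a page on the node of the CPU that first writes to it, as described in the quoted text.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes sched_setaffinity() and CPU_* macros in glibc */
#endif
#include <assert.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from Crafty): pin the calling thread to
 * `cpu`, then allocate `bytes` and write to every byte so the pages are
 * first touched by the pinned thread.
 *
 * If `pinned` is non-NULL it receives 1 when the affinity call succeeded
 * and 0 when it failed (for example, the CPU does not exist or is not
 * permitted for this process). The block is allocated either way.
 */
void *alloc_local_block(int cpu, size_t bytes, int *pinned)
{
    assert(bytes > 0);

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* A pid of 0 means "the calling thread" on Linux. */
    int ok = (sched_setaffinity(0, sizeof(set), &set) == 0);
    if (pinned != NULL)
        *pinned = ok;

    void *block = malloc(bytes);
    if (block != NULL)
        memset(block, 0, bytes);   /* the first touch happens here */

    return block;
}
```

Each search thread would call this itself at startup with its own CPU number, rather than having the main thread allocate and clear everything. One caveat: for small sizes malloc() may hand back memory from pages that some other thread has already touched, so this is more likely to help with large per-thread blocks.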
Last modified: Thu, 15 Apr 21 08:11:13 -0700