Author: Robert Hyatt
Date: 08:00:13 09/02/03
On September 01, 2003 at 06:09:48, Mridul Muralidharan wrote:

>Hi Jeremiah,
>
>  If you want crafty to work with a decent speedup on a 16 or 32 CPU
>cc-numa where you have say 2 - 4 processors per node with a significant
>inter-node latency (like most higher cpu numa boxes ?!), you will have to
>ensure that the way you split, memory usage, etc. is optimal - you don't want to
>access a hash entry in proc 0 from proc 32 when the latency will be in
>milliseconds !!!
>
>Hope you are appreciating the real world problems - not theoretical issues.
>If you actually work on these boxes, you will appreciate the problems faced by
>these developers more - using threads on such a box for parallelism, urggh !!
>
>Regards
>Mridul
>
>PS : Forget getting crafty to work on the 500 CPU beast that Vincent is working
>on without a total crafty rewrite ! The horrors Vincent must be facing are
>unimaginable - the ones he has already mentioned in this forum and, I'm sure,
>the more horrible ones he may not have !
>

A total rewrite is _not_ needed. The search is already designed to work on
_any_ type of parallel machine. The issue is allocating data structures on
the right processor. This will _not_ be a major change.

I currently use an array of split blocks. What I need is an array of pointers
to split blocks, so that each processor can allocate a few split blocks in its
local memory. Then, when a parallel thread needs a split block, the block
allocator simply has to prefer split blocks that live in the local memory of
the processor that will be using them.

It is something I have on my list of things to do, but the real issue is
"how to allocate _local_ memory" reliably, without wrecking things on
non-NUMA machines? That's why I haven't looked at this lately. I looked at it
a year ago on a NUMA alpha box, but unfortunately the code was lost when the
disk on that machine crashed with no backups. I got a new disk, but the source
changes were lost. This was written around Compaq's UPC compiler.

>On August 30, 2003 at 10:40:03, Vincent Diepeveen wrote:
>
>>On August 29, 2003 at 23:41:32, Jeremiah Penery wrote:
>>
>>>On August 29, 2003 at 18:40:23, Mridul Muralidharan wrote:
>>>
>>>>I'm not sure of what/why Prof. Bob Hyatt may have made those comments. But to
>>>>get a program like crafty to work properly in a numa machine will not be trivial
>>>>- and it won't be tweaks, but something more.
>>>
>>>All multi-CPU Opteron machines are NUMA. Crafty will work just fine in those.
>>>It will not be theoretically optimal, but that also depends on the OS to help
>>>with NUMA issues.
>>
>>The OS has to do very little for chess programs. Just keep scheduling the same
>>process at the same cpu and physically allocating local memory at that cpu's
>>RAM.
>>
>>Of course for a lot of other services the OS has to do a lot different, yet in
>>chess programs we do not need it as most of us, except for example Cilkchess,
>>write their parallelism at a very low level.
>>
>>>>Duals, etc. count as SMP machines, not cc-numa which I was referring to.
>>>Dual Opterons are NUMA.
>>
>>And soon all duals that we can afford will be.
>>
>>Best regards,
>>Vincent
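
For anyone who wants to see the split-block idea from the reply in code, here is a rough sketch of "an array of pointers to split blocks" with an allocator that prefers blocks owned by the requesting processor. This is _not_ Crafty's source. Every name in it (SPLIT_BLOCK, alloc_local, init_split_blocks, get_split_block, release_split_block) and both sizes are made up for illustration. alloc_local is a plain calloc placeholder, since how to get node-local memory reliably is exactly the open question in the post. Locking is left out, so a real version would need to protect the table.

```c
#include <stdlib.h>

#define MAX_CPUS        4   /* hypothetical sizes, for illustration only */
#define BLOCKS_PER_CPU  3
#define MAX_BLOCKS      (MAX_CPUS * BLOCKS_PER_CPU)

typedef struct split_block {
  int owner;   /* CPU whose local memory is supposed to hold this block */
  int used;    /* nonzero while some thread is searching with this block */
  /* ... the per-split search state would live here ... */
} SPLIT_BLOCK;

/* An array of pointers, not an array of blocks: each block can then
   be placed in a different processor's memory. */
static SPLIT_BLOCK *split_blocks[MAX_BLOCKS];

/* Placeholder (assumption): plain calloc. A NUMA port would swap in
   whatever node-local allocation the platform provides. */
static void *alloc_local(int cpu, size_t bytes) {
  (void) cpu;
  return calloc(1, bytes);
}

/* Meant to be called once per CPU, by the thread running on that CPU.
   Returns 1 on success, 0 if an allocation failed. */
int init_split_blocks(int cpu) {
  for (int i = 0; i < BLOCKS_PER_CPU; i++) {
    SPLIT_BLOCK *b = alloc_local(cpu, sizeof *b);
    if (!b)
      return 0;
    b->owner = cpu;
    split_blocks[cpu * BLOCKS_PER_CPU + i] = b;
  }
  return 1;
}

/* Prefer a free block owned by `cpu`; otherwise fall back to the first
   free block anywhere. Returns NULL if every block is in use.
   No locking here: a real allocator must serialize this. */
SPLIT_BLOCK *get_split_block(int cpu) {
  SPLIT_BLOCK *fallback = NULL;
  for (int i = 0; i < MAX_BLOCKS; i++) {
    SPLIT_BLOCK *b = split_blocks[i];
    if (!b || b->used)
      continue;
    if (b->owner == cpu) {
      b->used = 1;
      return b;
    }
    if (!fallback)
      fallback = b;
  }
  if (fallback)
    fallback->used = 1;
  return fallback;
}

void release_split_block(SPLIT_BLOCK *b) {
  b->used = 0;
}
```

The fallback to a remote block is a design choice in this sketch: a thread that has used up its local blocks still gets to split, just with slower memory, which also keeps the behavior unchanged on a non-NUMA machine where "owner" makes no difference.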
Last modified: Thu, 15 Apr 21 08:11:13 -0700