Author: Robert Hyatt
Date: 07:18:13 08/31/01
Go up one level in this thread
On August 31, 2001 at 09:03:28, Vincent Diepeveen wrote: >On August 31, 2001 at 00:20:28, Robert Hyatt wrote: > >>On August 30, 2001 at 20:45:07, Vincent Diepeveen wrote: >> >>>On August 30, 2001 at 13:56:53, Scott Gasch wrote: >>> >>>Some years ago i was faced with the same problems as you >>>face now with. >>> >>>Without doubt the best solution for windows is what Andrew Dados >>>suggests, the global thread variables. >>> >>>One small problem is that your program only works for windows then, >>>and monsoon no longer works then under linux. >>> >>>You can slow down your program, bob is always a fan of that, as overhead >>>doesn't need to be 400% of course. Bob estimates it at i think 10% or so? >>> >>>But by far the simplest solution to get rid of all these problems is >>>to get multiprocessing. >>> >>>Whatever way you search parallel, multiprocessing is faster if you >>>want to avoid all the tough global thread variable definitions! >>> >>>Also at the superb dual AMD SMP chipset there is no longer any disadvantage >>>in multiprocessing. >>> >>>For BSD it even has more advantages, as bsd can only do multiprocessing, >>>i heart multithreading might deliver problems under bsd at a multiprocessor >>>machine. >>> >>>My tip go for a 0% overhead, and 0% problem thing and go multiprocessing. >>> >>>whether you multithread or singlethread, that hashtable you need to share >>>anyway, so who cares? >>> >>> >>> >>> >>> >> >> >>The problem is granularity. With threads, I can store a global variable >>and have all threads see it instantly. For message passing, as you are > >You can do that only if it is a volatile variable. In multiprocessing >you also can use shared global variables without problems. In fact >that's what i'm doing. They had better be volatile no matter which way you do the parallel search, or it is broken. > >>doing, the overhead is significant. Thousands of instructions per message, > >I am nowhere using message passing in my engine. > >>which means you can't split the tree where you might one one processor to >>search a single node. Threads don't have that overhead. They have exactly >>none. > >I have no idea how you think i implemented this, but i'm simply sharing >the tree datastructure. > >Of course i need to approach that using a pointer, but that's exactly >how you do it. Then your point would be? You claim your approach is "better" yet you have to pass a pointer around, which is exactly what I am doing. And this is somehow "better"?? > >The evaluation (slowest part of my program) i can use global arrays >without using a slow pointer which also needs to get passed to every >single function i use, which would also increase program size considerably >in all respects. I don't see how that makes it "better". It does make it "bigger" since all your code and non-shared data is duplicated in memory N times, once for each process you start. > >> >> >> >> >>>>I'm moving around data structures in my engine to consolodate things that are >>>>going to be needed on a per-cpu basis if/when I go parallel. >>>> >>>>One such structure is my move stack. It's a big array of moves with a start and >>>>end index per ply. So for example it might look like this: >>>> >>>>start[0] = 0 ... end[0] = 32 [array entries 0..32 hold moves at ply 0] >>>>start[1] = 33 ... end[1] = 60 [array entries 33..60 hold moves at ply 1] >>>>... >>>> >>>>Well if more than one thread is searching at once I will need more than one move >>>>stack and more than one ply counter. So I kept the same move stack struct and >>>>made g_MoveStack an array: >>>> >>>>MOVE_STACK g_MoveStack[NUM_CPU]; >>>> >>>>The code to access the move stack goes from this: >>>> >>>>g_MoveStack.iStart[g_iPly] = 0; >>>> >>>>to this: >>>> >>>>g_MoveStack[iCpuId].iStart[g_iPly[iCpuId]] = 0; >>>> >>>>Talk about a huge impact -- move move generator benchmark literally is 4x >>>>slower! These dereferences are damn expensive. There has to be a better way, >>>>can one of you assembly gurus give me a clue? >>>> >>>>Here is a solution I am thinking about -- have a struct per-thread that houses >>>>the ply and a pointer to the start of the right move stack entry. Then do >>>>something like this: >>>> >>>>THREAD_INFO *pThreadInfo = &(g_ThreadInfo[iCurrentThreadId]); >>>>(pThreadInfo->pMoveStack)->iStart[sThreadInfo->iCpuId] = 0; >>>> >>>>I bet this is just as slow though... Any advice? >>>> >>>>Thanks, >>>>Scott
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.