Author: Vincent Diepeveen
Date: 06:03:28 08/31/01
On August 31, 2001 at 00:20:28, Robert Hyatt wrote:

>On August 30, 2001 at 20:45:07, Vincent Diepeveen wrote:
>
>>On August 30, 2001 at 13:56:53, Scott Gasch wrote:
>>
>>Some years ago I was faced with the same problems you
>>face now.
>>
>>Without doubt the best solution for Windows is what Andrew Dados
>>suggests, the global thread variables.
>>
>>One small problem is that your program then only works under Windows,
>>and monsoon no longer works under Linux.
>>
>>You can slow down your program, Bob is always a fan of that, as overhead
>>doesn't need to be 400% of course. Bob estimates it at I think 10% or so?
>>
>>But by far the simplest solution to get rid of all these problems is
>>to go multiprocessing.
>>
>>Whatever way you search in parallel, multiprocessing is faster if you
>>want to avoid all the tough global thread variable definitions!
>>
>>Also, on the superb dual AMD SMP chipset there is no longer any disadvantage
>>in multiprocessing.
>>
>>For BSD it has even more advantages, as BSD can only do multiprocessing;
>>I heard multithreading might cause problems under BSD on a multiprocessor
>>machine.
>>
>>My tip: go for a 0% overhead, 0% problem thing and go multiprocessing.
>>
>>Whether you multithread or singlethread, that hashtable you need to share
>>anyway, so who cares?
>>
>
>
>The problem is granularity. With threads, I can store a global variable
>and have all threads see it instantly. For message passing, as you are

You can do that only if it is a volatile variable. In multiprocessing you
can also use shared global variables without problems. In fact that's what
I'm doing.

>doing, the overhead is significant. Thousands of instructions per message,

I am nowhere using message passing in my engine.

>which means you can't split the tree where you might want one processor to
>search a single node. Threads don't have that overhead. They have exactly
>none.

I have no idea how you think I implemented this, but I'm simply sharing
the tree data structure. Of course I need to access that through a pointer,
but that's exactly how you do it too. In the evaluation (the slowest part of
my program) I can use global arrays, without a slow pointer that would also
have to be passed to every single function I use, which would increase
program size considerably.

>
>
>>>I'm moving around data structures in my engine to consolidate things that are
>>>going to be needed on a per-cpu basis if/when I go parallel.
>>>
>>>One such structure is my move stack. It's a big array of moves with a start and
>>>end index per ply. So for example it might look like this:
>>>
>>>start[0] = 0 ... end[0] = 32 [array entries 0..32 hold moves at ply 0]
>>>start[1] = 33 ... end[1] = 60 [array entries 33..60 hold moves at ply 1]
>>>...
>>>
>>>Well if more than one thread is searching at once I will need more than one move
>>>stack and more than one ply counter. So I kept the same move stack struct and
>>>made g_MoveStack an array:
>>>
>>>MOVE_STACK g_MoveStack[NUM_CPU];
>>>
>>>The code to access the move stack goes from this:
>>>
>>>g_MoveStack.iStart[g_iPly] = 0;
>>>
>>>to this:
>>>
>>>g_MoveStack[iCpuId].iStart[g_iPly[iCpuId]] = 0;
>>>
>>>Talk about a huge impact -- the move generator benchmark is literally 4x
>>>slower! These dereferences are damn expensive. There has to be a better way,
>>>can one of you assembly gurus give me a clue?
>>>
>>>Here is a solution I am thinking about -- have a struct per-thread that houses
>>>the ply and a pointer to the start of the right move stack entry. Then do
>>>something like this:
>>>
>>>THREAD_INFO *pThreadInfo = &(g_ThreadInfo[iCurrentThreadId]);
>>>(pThreadInfo->pMoveStack)->iStart[pThreadInfo->iCpuId] = 0;
>>>
>>>I bet this is just as slow though... Any advice?
>>>
>>>Thanks,
>>>Scott
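
To make the multiprocessing suggestion in the reply above a bit more concrete, here is a minimal sketch in C. It assumes a POSIX system (fork and an anonymous shared mmap), which is not something the thread specifies, and the names SetupSharedHash, RunChildSearch and g_pSharedHash are hypothetical, not taken from any engine discussed here. The point is only that g_MoveStack and g_iPly stay ordinary globals with the original single-CPU access pattern, because each process gets its own copy after fork, while the hash table is the one thing reached through a shared pointer.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict compiler modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_PLY      64
#define HASH_ENTRIES 1024   /* toy size, an assumption for the sketch */

typedef struct {
    int iStart[MAX_PLY];
    int iEnd[MAX_PLY];
} MOVE_STACK;

/* Ordinary globals: after fork() every process has a private copy,
   so no [iCpuId] indexing is needed to keep searchers apart. */
static MOVE_STACK g_MoveStack;
static int g_iPly;

/* The only shared data in this sketch, reached through one pointer.
   volatile so stores and loads are not cached in registers. */
static volatile unsigned long *g_pSharedHash;

/* Map the table MAP_SHARED before forking so children inherit it.
   Returns 0 on success, -1 on failure. */
int SetupSharedHash(void)
{
    void *p = mmap(NULL, HASH_ENTRIES * sizeof(unsigned long),
                   PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    g_pSharedHash = (volatile unsigned long *)p;
    return 0;
}

/* Fork one child that touches its private globals and the shared table,
   then wait for it. Returns 0 if the child exited cleanly, -1 otherwise. */
int RunChildSearch(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: same access pattern as the single-CPU code. */
        g_iPly = 5;                       /* private to the child */
        g_MoveStack.iStart[g_iPly] = 33;  /* private to the child */
        g_pSharedHash[0] = 42UL;          /* visible to the parent */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After calling SetupSharedHash() and then RunChildSearch(), the parent sees the child's write in g_pSharedHash[0] but its own g_iPly and g_MoveStack are untouched. A real engine would still need some locking or a lockless scheme for hash entries written by several processes at once; that is left out here.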