Author: Vincent Diepeveen
Date: 05:59:52 08/31/01
On August 31, 2001 at 01:58:32, Bruce Moreland wrote:

>On August 30, 2001 at 20:45:07, Vincent Diepeveen wrote:
>
>>On August 30, 2001 at 13:56:53, Scott Gasch wrote:
>>
>>Some years ago I was faced with the same problems that you
>>face now.
>>
>>Without doubt the best solution for Windows is what Andrew Dados
>>suggests, the global thread variables.
>
>I looked into this. Actually, I looked into this in your house, while you were
>asleep, after you suggested this while we were sitting in that computer shop all
>day long.
>
>Thread-based variables in MSVC suck. Whenever you access one of them, it does
>very ugly looking stuff, which can't possibly be better than passing a pointer
>all over everywhere. The things use a segment register (fs). That has to be
>terrible.
>
>Needless to say, I handle this by passing a pointer all over everywhere, as in
>Gerbil.

Aha, so my solution to do multiprocessing was even smarter than I thought!

>bruce
>
>>
>>One small problem is that your program then only works on Windows,
>>and monsoon no longer works under Linux.
>>
>>You can slow down your program, Bob is always a fan of that, as overhead
>>doesn't need to be 400% of course. Bob estimates it at, I think, 10% or so?
>>
>>But by far the simplest solution to get rid of all these problems is
>>to go multiprocessing.
>>
>>Whichever way you search in parallel, multiprocessing is faster if you
>>want to avoid all the tough global thread variable definitions!
>>
>>Also, on the superb dual AMD SMP chipset there is no longer any disadvantage
>>in multiprocessing.
>>
>>For BSD it has even more advantages, as BSD can only do multiprocessing;
>>I heard multithreading might cause problems under BSD on a multiprocessor
>>machine.
>>
>>My tip: go for a 0% overhead, 0% problem thing and go multiprocessing.
>>
>>Whether you multithread or singlethread, you need to share that hashtable
>>anyway, so who cares?
>>
>>>I'm moving around data structures in my engine to consolidate things that are
>>>going to be needed on a per-cpu basis if/when I go parallel.
>>>
>>>One such structure is my move stack. It's a big array of moves with a start and
>>>end index per ply. So for example it might look like this:
>>>
>>>start[0] = 0 ... end[0] = 32 [array entries 0..32 hold moves at ply 0]
>>>start[1] = 33 ... end[1] = 60 [array entries 33..60 hold moves at ply 1]
>>>...
>>>
>>>Well if more than one thread is searching at once I will need more than one move
>>>stack and more than one ply counter. So I kept the same move stack struct and
>>>made g_MoveStack an array:
>>>
>>>MOVE_STACK g_MoveStack[NUM_CPU];
>>>
>>>The code to access the move stack goes from this:
>>>
>>>g_MoveStack.iStart[g_iPly] = 0;
>>>
>>>to this:
>>>
>>>g_MoveStack[iCpuId].iStart[g_iPly[iCpuId]] = 0;
>>>
>>>Talk about a huge impact -- my move generator benchmark literally is 4x
>>>slower! These dereferences are damn expensive. There has to be a better way,
>>>can one of you assembly gurus give me a clue?
>>>
>>>Here is a solution I am thinking about -- have a struct per-thread that houses
>>>the ply and a pointer to the start of the right move stack entry. Then do
>>>something like this:
>>>
>>>THREAD_INFO *pThreadInfo = &(g_ThreadInfo[iCurrentThreadId]);
>>>(pThreadInfo->pMoveStack)->iStart[pThreadInfo->iPly] = 0;
>>>
>>>I bet this is just as slow though... Any advice?
>>>
>>>Thanks,
>>>Scott
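
For reference, the "pass a pointer all over everywhere" approach discussed above can be sketched roughly as follows. This is only an illustration under assumptions, not code from Gerbil or monsoon: MOVE, MAX_PLY, MAX_MOVES, the iPly and moveStack members, InitThread and GenerateMoves are all hypothetical names, and the move generator just writes dummy values. The idea is that each search thread looks up its THREAD_INFO once and then hands the pointer down, so inner loops never index a global array by CPU id.

```c
#include <assert.h>
#include <string.h>

#define NUM_CPU   2
#define MAX_PLY   64
#define MAX_MOVES 2048

typedef int MOVE;                 /* hypothetical placeholder move type */

typedef struct {
    MOVE moves[MAX_MOVES];
    int  iStart[MAX_PLY];         /* index of first move at each ply */
    int  iEnd[MAX_PLY];           /* index of last move at each ply (inclusive) */
} MOVE_STACK;

typedef struct {
    int        iCpuId;
    int        iPly;              /* per-thread ply counter */
    MOVE_STACK moveStack;         /* per-thread move stack */
} THREAD_INFO;

THREAD_INFO g_ThreadInfo[NUM_CPU];

/* Reset one thread's private search state. */
void InitThread(THREAD_INFO *ctx, int iCpuId)
{
    memset(ctx, 0, sizeof(*ctx));
    ctx->iCpuId = iCpuId;
}

/*
 * Stand-in for a move generator: stores nMoves dummy moves at the
 * current ply of the given thread. All state is reached through ctx,
 * so no global array is indexed by CPU id here.
 * Returns nMoves, or -1 if the stack would overflow.
 */
int GenerateMoves(THREAD_INFO *ctx, int nMoves)
{
    MOVE_STACK *ms = &ctx->moveStack;   /* resolved once per call */
    int ply = ctx->iPly;
    int first, i;

    assert(ply >= 0 && ply < MAX_PLY);
    first = (ply == 0) ? 0 : ms->iEnd[ply - 1] + 1;
    if (nMoves < 0 || first + nMoves > MAX_MOVES)
        return -1;

    ms->iStart[ply] = first;
    for (i = 0; i < nMoves; i++)
        ms->moves[first + i] = i;       /* dummy move value */
    ms->iEnd[ply] = first + nMoves - 1;
    return nMoves;
}
```

A thread would call InitThread(&g_ThreadInfo[id], id) once at startup and then pass that pointer into its search and move generation routines. The alternative mentioned in the thread, thread-local variables (declared with __declspec(thread) in MSVC), avoids the extra parameter but is compiler-specific.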
Last modified: Thu, 15 Apr 21 08:11:13 -0700