Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: How to make your movegen 4x slower in 1 easy step

Author: Robert Hyatt

Date: 11:41:59 08/30/01

Go up one level in this thread


On August 30, 2001 at 13:56:53, Scott Gasch wrote:

>I'm moving around data structures in my engine to consolodate things that are
>going to be needed on a per-cpu basis if/when I go parallel.
>
>One such structure is my move stack.  It's a big array of moves with a start and
>end index per ply.  So for example it might look like this:
>
>start[0] = 0  ...  end[0] = 32  [array entries 0..32 hold moves at ply 0]
>start[1] = 33 ...  end[1] = 60  [array entries 33..60 hold moves at ply 1]
>...
>
>Well if more than one thread is searching at once I will need more than one move
>stack and more than one ply counter.  So I kept the same move stack struct and
>made g_MoveStack an array:
>
>MOVE_STACK g_MoveStack[NUM_CPU];
>
>The code to access the move stack goes from this:
>
>g_MoveStack.iStart[g_iPly] = 0;
>
>to this:
>
>g_MoveStack[iCpuId].iStart[g_iPly[iCpuId]] = 0;
>
>Talk about a huge impact -- move move generator benchmark literally is 4x
>slower!  These dereferences are damn expensive.  There has to be a better way,
>can one of you assembly gurus give me a clue?
>
>Here is a solution I am thinking about -- have a struct per-thread that houses
>the ply and a pointer to the start of the right move stack entry.  Then do
>something like this:
>
>THREAD_INFO *pThreadInfo = &(g_ThreadInfo[iCurrentThreadId]);
>(pThreadInfo->pMoveStack)->iStart[sThreadInfo->iCpuId] = 0;
>
>I bet this is just as slow though... Any advice?
>
>Thanks,
>Scott


I do it about like you  are.  I have an array of pointers declared as
TREE *thread[NCPUS];

Tree is a large structure with move stack, all the bitboards, everything needed
in a specific search state.  There is then a pointer to one of these for each
thread that is active.  I pass around a pointer TREE *tree, which is just a
pointer to this thread's local search space block.  If you look at all the
procedures in Crafty, you will note that they all include a variable tree that
is declared as above and passed in as a formal parameter, except for those
few rare procedures that don't need to access the tree state data.

I only saw a slow-down of a few percent, not anything like what you are
seeing.  Probably that double index operation is costing too many register
spills.




This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.