Computer Chess Club Archives

Subject: Re: How to make your movegen 4x slower in 1 easy step

Author: Vincent Diepeveen

Date: 06:03:28 08/31/01

On August 31, 2001 at 00:20:28, Robert Hyatt wrote:

>On August 30, 2001 at 20:45:07, Vincent Diepeveen wrote:
>
>>On August 30, 2001 at 13:56:53, Scott Gasch wrote:
>>
>>Some years ago I faced the same problems you
>>face now.
>>
>>Without doubt the best solution for windows is what Andrew Dados
>>suggests, the global thread variables.
>>
>>One small problem is that your program then only works on Windows,
>>and monsoon no longer works under Linux.
>>
>>You can also accept a slower program, which Bob is always a fan of; the overhead
>>doesn't need to be 400%, of course. Bob estimates it at, I think, 10% or so?
>>
>>But by far the simplest solution to get rid of all these problems is
>>to go multiprocessing.
>>
>>Whichever way you search in parallel, multiprocessing is faster if you
>>want to avoid all the tough global thread variable definitions!
>>
>>Also, on the superb dual AMD SMP chipset there is no longer any disadvantage
>>to multiprocessing.
>>
>>For BSD it has even more advantages, as BSD can only do multiprocessing;
>>I heard multithreading might cause problems under BSD on a multiprocessor
>>machine.
>>
>>My tip: go for the 0% overhead, 0% problem option and go multiprocessing.
>>
>>Whether you multithread or multiprocess, you need to share that hash table
>>anyway, so who cares?
>>
>>
>>
>>
>>
>
>
>The problem is granularity.  With threads, I can store a global variable
>and have all threads see it instantly.  For message passing, as you are

You can do that only if it is a volatile variable. In multiprocessing
you can also use shared global variables without problems. In fact
that's what I'm doing.
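
A minimal sketch of a global shared between processes, assuming a POSIX system with anonymous shared mappings. The names `SharedState`, `shared_state_create` and `child_writes_parent_reads` are hypothetical and not taken from any engine in this thread. Note that `volatile` only keeps the compiler from caching the value in a register; here the parent relies on `waitpid` to know the child's writes are complete.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared state; field names are illustrative only. */
typedef struct {
    volatile int  stop_flag;  /* written by one process, read by another */
    volatile long nodes;      /* e.g. a node counter */
} SharedState;

/* Map one SharedState that stays shared across fork().
   Returns NULL on failure. */
SharedState *shared_state_create(void)
{
    void *p = mmap(NULL, sizeof(SharedState), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    SharedState *s = p;
    s->stop_flag = 0;
    s->nodes = 0;
    return s;
}

/* Fork a child that writes into the shared block. The parent waits
   for the child to exit and returns the value it then sees, or -1
   on error. */
long child_writes_parent_reads(SharedState *s, long value)
{
    assert(s != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {            /* child process */
        s->nodes = value;
        s->stop_flag = 1;
        _exit(0);
    }
    int status = 0;            /* parent process */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return s->stop_flag ? s->nodes : -1;
}
```

Because the mapping is created before `fork()`, both processes reach it through the same pointer value, and everything else (ordinary globals, the stack) stays private to each process.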

>doing, the overhead is significant.  Thousands of instructions per message,

I am not using message passing anywhere in my engine.

>which means you can't split the tree where you might want one processor to
>search a single node.  Threads don't have that overhead.  They have exactly
>none.

I have no idea how you think I implemented this, but I'm simply sharing
the tree data structure.

Of course I need to access that through a pointer, but that's exactly
how you do it.

In the evaluation (the slowest part of my program) I can use global arrays
without a slow pointer that would also have to be passed to every
single function I use, which would also increase program size considerably
in all respects.
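
The two call shapes being contrasted can be sketched as follows. This is an illustration only: `g_board`, `SearchContext`, `eval_global` and `eval_context` are hypothetical names, and the "evaluation" is just a sum over the board. Whether the pointer version is actually slower depends on the compiler and the machine.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_SQUARES = 64 };

/* Style 1: one process per CPU. Each process has its own copy of
   this global, so the evaluation can address it directly. */
int g_board[NUM_SQUARES];

int eval_global(void)
{
    int score = 0;
    for (int sq = 0; sq < NUM_SQUARES; sq++)
        score += g_board[sq];
    return score;
}

/* Style 2: threads share one address space, so per-thread state
   lives in a struct and travels through every call as a pointer. */
typedef struct {
    int board[NUM_SQUARES];
} SearchContext;

int eval_context(const SearchContext *ctx)
{
    assert(ctx != NULL);
    int score = 0;
    for (int sq = 0; sq < NUM_SQUARES; sq++)
        score += ctx->board[sq];
    return score;
}
```

In the first style every helper called from the evaluation keeps its original signature; in the second, each helper that touches the board gains a `ctx` parameter.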

>
>
>
>
>>>I'm moving around data structures in my engine to consolidate things that are
>>>going to be needed on a per-cpu basis if/when I go parallel.
>>>
>>>One such structure is my move stack.  It's a big array of moves with a start and
>>>end index per ply.  So for example it might look like this:
>>>
>>>start[0] = 0  ...  end[0] = 32  [array entries 0..32 hold moves at ply 0]
>>>start[1] = 33 ...  end[1] = 60  [array entries 33..60 hold moves at ply 1]
>>>...
>>>
>>>Well if more than one thread is searching at once I will need more than one move
>>>stack and more than one ply counter.  So I kept the same move stack struct and
>>>made g_MoveStack an array:
>>>
>>>MOVE_STACK g_MoveStack[NUM_CPU];
>>>
>>>The code to access the move stack goes from this:
>>>
>>>g_MoveStack.iStart[g_iPly] = 0;
>>>
>>>to this:
>>>
>>>g_MoveStack[iCpuId].iStart[g_iPly[iCpuId]] = 0;
>>>
>>>Talk about a huge impact -- my move generator benchmark is literally 4x
>>>slower!  These dereferences are damn expensive.  There has to be a better way;
>>>can one of you assembly gurus give me a clue?
>>>
>>>Here is a solution I am thinking about -- have a struct per-thread that houses
>>>the ply and a pointer to the start of the right move stack entry.  Then do
>>>something like this:
>>>
>>>THREAD_INFO *pThreadInfo = &(g_ThreadInfo[iCurrentThreadId]);
>>>(pThreadInfo->pMoveStack)->iStart[pThreadInfo->iCpuId] = 0;
>>>
>>>I bet this is just as slow though... Any advice?
>>>
>>>Thanks,
>>>Scott
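
A hedged sketch of the per-CPU move stack Scott describes, with the per-CPU lookup done once and a pointer handed to the hot code instead of indexing `g_MoveStack[iCpuId]` on every access. The `MOVE` typedef, the array sizes, the `iPly` field and the helpers `GetMoveStack`, `BeginPly` and `PushMove` are assumptions for illustration; only `MOVE_STACK`, `g_MoveStack`, `iStart` and `NUM_CPU` appear in the quoted post. End indices are inclusive, matching the start/end layout quoted above.

```c
#include <assert.h>

enum { NUM_CPU = 2, MAX_PLY = 64, MAX_MOVES = 4096 };

typedef unsigned int MOVE;      /* assumption: a move packed in an int */

typedef struct {
    MOVE moves[MAX_MOVES];      /* all plies share one big array */
    int  iStart[MAX_PLY];       /* first entry of each ply */
    int  iEnd[MAX_PLY];         /* last entry of each ply (inclusive) */
    int  iPly;                  /* current ply of this CPU's search */
} MOVE_STACK;

MOVE_STACK g_MoveStack[NUM_CPU];

/* Resolve the per-CPU stack once, e.g. when a search thread starts. */
MOVE_STACK *GetMoveStack(int iCpuId)
{
    assert(iCpuId >= 0 && iCpuId < NUM_CPU);
    return &g_MoveStack[iCpuId];
}

/* Open a ply: its moves begin right after the previous ply's last
   move. An empty ply has iEnd == iStart - 1. */
void BeginPly(MOVE_STACK *pStack, int iPly)
{
    assert(iPly >= 0 && iPly < MAX_PLY);
    pStack->iPly = iPly;
    pStack->iStart[iPly] = (iPly == 0) ? 0 : pStack->iEnd[iPly - 1] + 1;
    pStack->iEnd[iPly] = pStack->iStart[iPly] - 1;
}

/* Append one move to the current ply. */
void PushMove(MOVE_STACK *pStack, MOVE mv)
{
    int iPly = pStack->iPly;
    assert(pStack->iEnd[iPly] + 1 < MAX_MOVES);
    pStack->moves[++pStack->iEnd[iPly]] = mv;
}
```

The move generator then takes a `MOVE_STACK *` argument and never sees `iCpuId`, so the inner loop does the same single-level indexing as the original single-threaded code, plus one pointer.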

Re: How to make your movegen 4x slower in 1 easy step Robert Hyatt 07:18:13 08/31/01
- Why multiprocessing is hell faster than multithreading Vincent Diepeveen 03:59:01 09/01/01
  - Re: Why multiprocessing is hell faster than multithreading Robert Hyatt 07:06:24 09/01/01
- Re: How to make your movegen 4x slower in 1 easy step (more) Robert Hyatt 07:21:02 08/31/01


Last modified: Thu, 15 Apr 21 08:11:13 -0700
