Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: And now something completely different: Multithreading

Author: Robert Hyatt
Date: 12:36:17 12/01/03
On December 01, 2003 at 13:31:19, Tony Werten wrote:

>On December 01, 2003 at 11:34:24, Robert Hyatt wrote:
>
>>On December 01, 2003 at 11:06:08, Tony Werten wrote:
>>
>>>On December 01, 2003 at 10:46:00, Robert Hyatt wrote:
>>>
>>>>On December 01, 2003 at 02:41:28, Tony Werten wrote:
>>>>
>>>>>When my masterthread is spinning, waiting for results from it's workers threads,
>>>>>how do I keep it from burning CPU time ?
>>>>
>>>>You have a couple of choices.  But why is it important?  You have two cpus.  One
>>>>thread is doing something useful, the other is spinning.  What is the problem?
>>>>The spinning thread has _zero_ effect on the running thread.  IE in crafty I
>>>>have something like this:
>>>>
>>>>while (!work);
>>>
>>>I don't want to do it the way Crafty does :) ( Nothing personal)
>>>
>>>I have the idea I make the parallel search more effective if I let the
>>>splitpoint exist after splitting.
>>
>>That has nothing to do with sleeping or spinning.
>>In fact, I am not sure what it means.  My "split points" always "exist
>>after splitting" that is why I call them "split points"  :)
>>
>>>
>>>So when I search with 1 thread, I split, create (pick) 2 new threads let the
>>>splitpoint sleep, and have the 2 new threads search.
>>
>>You will have a _huge_ problem.  A deep search will blow memory.  Each thread
>>needs its own stack.  Sleeping threads as well.  I used this in my very
>>first pthread version of crafty, but the overhead for creating threads is _not_
>>zero.  It is very measurable and if you do it thousands of times, you incur a
>>lot of overhead in terms of time, not to mention you can certainly run out of
>>processes easily.  It happened to me.
>
>I'm probably mixing up some terms. I'll try to explain.
>
>At start I want to create 2 * nr_processors threads (all capable of searching
>and splitting, so basicly a search class) and suspend them.
>
>Then when the 1st thread is ready to split, it itself enters a special function
>where it becomes a splitter, feeds (single) moves to the suspended threads,
>unsuspend them and waits for results to return. When a thread does that, the
>splitter hands out a new move etc.
>
>My idea was that when 1 thread is finished, it will go back to the splitpoint to
>help, rather then request another thread to share stuff. I think it will be more
>effective to share close to the root and not at a "random" point, wich will be
>closer to the leafs.

Note that if you avoid splitting near the leafs you kill performance.  And
you need a _bunch_ of threads or else you end up with all but 1 waiting,
which won't do you any good...  That's the _first_ issue you have to solve.

And that is why I start N threads, and never have one of them wait.  When I
reach a split point, I sic other threads on that split point and this thread
then "sics itself" on the same point with a cute data structure that keeps
up with who is doing what and where...  I never have a waiting thread, I
hardly ever have less then N threads doing useful work (ie on a quad, on long
searches, I may lose 1% of one CPU, if that.  On bullet-type searches, I
might lose 10% of one (that is 10% out of 400% total).

You have to keep all processors running all the time, and you have to avoid
searching stuff the serial search would not search.  Those are the two golden
rules of parallel search.  Next you have to eliminate locks as much as
possible.  And finally don't have two threads modifying the same data if you
can help it, this kills cache.

>
>Tony
>
>PS I'm only in the thinking stage.
>
>>
>>
>>
>>
>>>
>>>Tony
>>>
>>>>
>>>>work gets copied into cache.  Until another thread writes to it, I spin on
>>>>local cache, no bus traffic whatsoever, and the instant I get work I start
>>>>to work on it the very next CPU cycle.  If you block, you take a huge
>>>>performance hit to block and then unblock.
>>>>
>>>>>
>>>>>The apifunction sleep() doesn't some threadsafe, suspending the thread and
>>>>>having it resumed by the worker seems overly complicated. ( Accept maybe if I
>>>>>can use a callback function )
>>>>>
>>>>
>>>>use a mutex-type lock that blocks when the lock is already set.  Or if
>>>>using posix threads a "condition variable" will work better as multiple
>>>>threads can wait for the condition to be satisfied and all can proceed
>>>>when it is.  But spinning is _the_ high-performance solution.
>>>>
>>>>IE I certainly hope you use spinlocks rather than blocking locks for
>>>>mutual exclusion/critical sections???
>>>>
>>>>
>>>>>Any thoughts ?
>>>>>
>>>>>Tony
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.