Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Multi-threading issues

Author: Robert Hyatt

Date: 14:05:51 11/08/02

Go up one level in this thread


On November 08, 2002 at 10:09:59, Vincent Diepeveen wrote:

>On November 07, 2002 at 17:10:12, Russell Reagan wrote:


Some of the below is simply wrong...


See below for comments.

>
>What you basically need for multithreading is for both
>unix and windows a function to start the thread and
>for both OSes a function to lock global memory variables (volatile int
>for example).
>
>So if you define the lock function in your header file you will
>be able to get away with very little OS specific code in your *.C(PP)
>files, basically just starting and ending threads is what you need
>to worry about.
>
>Questionable is however why you want to write the i/o multithreaded.
>
>What you basically describe here is exactly what i'm rewriting DIEP
>to currently (a second thread for i/o and time management and all
>things except searching). The reason to do so has nothing
>to do with what most might guess. It isn't faster to start with,
>and i'm not sure it works better anyway. I just had to solve more
>problems so far and time management has become more complicated
>too to do, as i need to put everything in the shared memory now,
>which creates extra variables, because previously simply process 0
>was checking after finishing a ply whether it had to play the move
>by then, now i must solve that in a different way (so more variables).
>
>Which by definition is slower.
>
>If my interface for example gives a couple of thousands of commands to
>the engine, i want those get parsed and the search processes stopped
>while those commands get parsed. That idea looks bright and clear, but
>it is not so clear in fact. It makes of course stopping the search
>kind of 'fool proof', but the problems are not so much stopping the
>search, but starting it and all communication between search and i/o.
>
>That foolproof stopping of the search is the only *advantage* i could
>think of actually if we have to speak about an advantage anyway.
>
>It isn't easy to quickly let a search start now. I lose at least like
>0.002 seconds to that now (guessed time to wake up a process/thread
>in windows), a loss which wasn't there when i was not multithreaded
>with the i/o. Process0 just continued on forever back then.
>
>The alternative, which is in fact way faster and easier, is a polling
>model. For example each 10000 nodes or so you check the whether there
>is i/o or a timeout of the search.

This doesn't have to be "way faster".  On the Cray, I couldn't do this reliably,
so the only choice was an I/O thread.  The I/O thread read something, put it in
a buffer, and set a global flag that the search spotted and then executed the
command.
No slower or faster whatsoever, done correctly.

The advantage is cleanliness.  The disadvantage is portability.


>
>That's by far faster than an additional thread that both polls the i/o
>and as well is checking for timeouts and stops search.

Doesn't have to be _any_ faster or slower.  You can simply set a global variable
when some condition happens and the search can spot that at the top of the next
node
processing.  Which is where you would normally notice you have input in the
polling
version.




>
>In fact the thread has to be woken up by the system and doing that is
>of course not faster than a processor clock or 100 which you lose
>to calling a function like for example 'GetTickCount()' in windows (less
>most likely).
>

It doesn't count much compared to the overhead to do the actual read() call
however, so that is pretty much moot...


>In linux i don't know how many clocks for example GetTimeOfDay eats.
>
>Probably plenty fast too, a few hundreds of processor clocks are not
>really comparable to the huge loss of waking up a process/thread.

This is not a horrible cost at all.  but this would only happen very rarely when
someone
types a command or an "alarm" fires, so it would not be measurable.  It would be
far
worse if every lock() were to block you and then you have to unblock later.
Doing that
a few thousand times per second will kill things.  Doing it once every 2 minutes
won't
change a thing.


>
>Checking for i/o seems more expensive, but it doesn't take away that
>the grass is always greener somewhere else, so probably no one will stop
>you making another thread.
>
>Making your engine SMP is of course a good idea, but really you perhaps
>don't want to multithread at all there, but go for the much
>easier multiprocessing :)
>
>Just start 2 processes instead of 1, let one process do as if it is
>single cpu and play the game with that. the other process just fills the
>hashtable at the same time.
>
>A good programmer makes that form of SMP within a weekend or
>so (i needed 5 hours in fact to experiment with that up
>to 20 processors), so perhaps you'll manage within a week.
>
>That's a peanut time investment compared to a full SMP algorithm like
>Crafty, Sjeng, not to mention what DIEP uses.
>
>Crafty and DIEP needed years to bugfix,

Don't know where you have been, but the parallel search in Crafty took two weeks
to write.  It may well _still_ have a bug or two that I haven't seen show up, of
course.
But the idea is to start off with something that has a chance of producing good
performance.  Just sharing hash tables isn't going to offer much.





 >and Sjeng still crashes here
>if i toy a lot with it. I initially developed on a single cpu machine,
>but i would have never gotten it to work without testing on Bob's machine
>up to 4 processors. Even then it took another 1.5 years to bugfix a
>race condition which happened (also at a dual processor i had in the
>meantime) once each 1 billion nodes or so.
>
>My current implementation where all parallel functions can get called
>at the same time; so processors can get aborted at several points in
>the search tree while new splits simultaneously occur at other points
>in the tree and processors that have finished searching a position look
>for themselves without hurting other processes for a new splitpoint in
>the tree.
>
>Without being capable of testing fulltime for a month or 2-3 at up to
>20 processors, i would not have been capable of creating that last few
>months. Of course this is one of the big advantages of supercomputers
>with so many processors like the TERAS (1024 processor SGI R14000s see
>http://www.nwo.nl/nwohome.nsf/pages/ACPP_4X6R5C for some pictures of
>it and for a description in english and more info and pictures:
>  http://www.sara.nl/userinfo/teras/description/index.html
>),
>there are usually at a testpartition a few idle to get used for testing,
>which is a very crucial thing to do when multiprocessing.
>
>But still i must do some additional tests to fine tune things so that
>it runs great on n processors (where n is considerable bigger than 32
>like 500; testpartition only goes till 32 processors).
>
>>I would like to add multi-threading into my engine, to make it interface more
>>easily with winboard, and eventually play around with supporting SMP. I have a
>>few questions about multi-threading issues however.
>>
>>I would like to be portable, but that seems rather difficult. I asked about
>>pthreads, and someone said they aren't really that portable, and recommended
>>using SDL (www.libsdl.org). What is the best bet if I eventually want to support
>>linux/unix in addition to windows?
>>
>>Secondly, I like the Interlocked functions on Windows. Is there anything similar
>>on Linux or Unix?
>>
>>Lastly, is there any way to write your own routines to make your threads work
>>correctly? Or is it something that has to be implemented at the OS level? What's
>>wrong with passing messages between threads using simple atomic operations? For
>>example, if I have an input thread, and I want to signal my search thread that
>>it should stop, why can't I set a global variable that the search thread reads
>>from? It seems like it would work if you did message passing using one way
>>variables (like thread1 only writes to var1 and thread2 only reads var1, thread2
>>only writes to var2, thread1 reads var2, etc.). In Crafty there looks like some
>>code that isn't OS specific, but I don't see how it avoids synchronization
>>problems.
>>
>>Multi-threading makes me feel quite insecure that my program is running
>>bug-free. It seems like you have to be an expert on the subject before you can
>>even begin to spot potential bugs. I guess it's best to stick to as few shared
>>variables as possible?
>>
>>In summary, I'm lost...
>>
>>Russell



This page took 0.02 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.