Computer Chess Club Archives



Subject: Re: Odd hyperthreading behavior

Author: Robert Hyatt

Date: 13:28:45 10/07/03



On October 07, 2003 at 14:14:51, Tom Kerrigan wrote:

>On October 06, 2003 at 09:56:59, Robert Hyatt wrote:
>
>>On October 06, 2003 at 06:15:58, Tom Kerrigan wrote:
>>
>>>On October 04, 2003 at 23:54:43, Jeremiah Penery wrote:
>>>
>>>>On October 04, 2003 at 22:27:45, Tom Kerrigan wrote:
>>>>
>>>>>You're right, the #s do work out pretty well, but that must mean both threads
>>>>>are bouncing around different logical processors frequently, otherwise there'd
>>>>>be a disparity in the node counts of each thread. This doesn't seem likely to me
>>>>>but I guess it's not impossible. I wonder if there's a way to keep track of
>>>>>which thread is running on which processor.
>>>>
>>>>The threads really do hop around the processors.  You wrote previously that a
>>>>process should stay on the same processor until 'something odd happens' - in
>>>>other words, until something pre-empts the process.  It's very frequent that a
>>>>part of the kernel, or a service, or something else with default high+ priority
>>>>gets scheduled for some minuscule fraction of time, so it will bump your process
>>>>to another processor.  If you raise your program's priority to realtime and run
>>>>two threads, you should be able to get more stable results in this regard.
>>>
>>>Wouldn't this result in a lot of cache thrashing that would slow the program
>>>down? If I run two copies of my program simultaneously they each run exactly as
>>>fast as one copy. (Which is interesting because both copies use hash tables--I
>>>suppose the hash table probes are infrequent enough to not cause many FSB
>>>collisions.)
>>>
>>>-Tom
>>
>>
>>There are multiple issues.  The APIC chooses which CPU to send an
>>interrupt to (it appears to send all interrupts to CPU#0 on my dual
>>2.8 running 2.4.21 linux.)  An interrupt is definitely going to dislodge
>>a process for a brief period of time.
>
>What do you mean by "dislodge"? Push it onto the other processor? Because this
>must take some amount of work.


Several choices.  While the interrupt handler is running, the process is
not.  It could be moved to another processor, although that would seem like
a bad design.  However, when an interrupt occurs, the overall process state
most likely changes, since a previously blocked process is probably now
ready to go, and it _could_ be dispatched on this CPU if it has a higher
priority than the currently running process.





> Whatever schedulers are running on each processor
>have to communicate with each other about which processes to stop/start. Why
>move it to the other processor anyway? Something is likely already running on
>that processor so by the time the process gets a timeslice, it might as well
>have just waited for the stupid interrupt to be handled anyway. So maybe this is
>what happens and there's a good reason for it that I don't know about but right
>now what you're suggesting sounds like it's more work for no gain.
>
>-Tom


I'm not sure why bouncing has to happen.  I have run _lots_ of Linux kernels
with various "affinity" fixes.  Some work.  Some are sluggish.  But most bounce
no matter what.  In Linux, run xosview with one process on a dual or quad
machine.  Depending on the kernel, the active process will either stick mainly
to one CPU or bounce equally over all 4, which seems ridiculous.  If you try
to make it "sticky", that hurts in one direction; if you don't, it bounces too
easily and blows the cache out.  That hurts in another direction.
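
For anyone who wants to watch or suppress the bouncing from user space, here
is a minimal sketch.  It assumes a reasonably current Linux kernel and
Python 3.3 or later, neither of which is what the kernels discussed above
provided, and the helper names pin_to_one_cpu and current_cpu are made up
for illustration.  It pins the calling process to one CPU and reads the
"CPU last executed on" field from /proc/self/stat.

```python
import os


def pin_to_one_cpu():
    """Restrict this process to a single allowed CPU.

    Returns the chosen CPU number, or None where the platform does not
    expose the affinity calls (in Python they are Linux-specific).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = sorted(os.sched_getaffinity(0))  # pid 0 means "this process"
    target = allowed[0]
    os.sched_setaffinity(0, {target})
    return target


def current_cpu():
    """Best effort: the CPU this process last ran on, from /proc/self/stat.

    Field 39 of that file is the CPU number last executed on.  The command
    name (field 2) may contain spaces, so split after its closing paren.
    Returns None if /proc is unavailable.
    """
    try:
        with open("/proc/self/stat") as f:
            rest = f.read().rsplit(")", 1)[1].split()
    except (OSError, IndexError):
        return None
    # rest[0] is field 3 (state), so field 39 is rest[36].
    return int(rest[36]) if len(rest) > 36 else None


if __name__ == "__main__":
    print("before pinning, last ran on CPU", current_cpu())
    print("pinned to CPU", pin_to_one_cpu())
    print("after pinning, last ran on CPU", current_cpu())
```

Calling current_cpu() in a loop without pinning is a cheap way to see how
often a busy process hops; whether pinning actually helps depends on the
kernel and the workload.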

BTW, processors really don't have to communicate with each other to run
processes.  If a processor is idle, it is simply waiting.  When another
processor unblocks a process, it will "ring the doorbell" of an idle processor,
telling it to run the scheduler loop and grab the newly unblocked process.  It
doesn't always run there, however, as a process on another CPU can block and
cause that processor to run the scheduler loop and pick up the newly unblocked
process.  Linux has had some fixes for this to always try to run a process
on the same CPU it used the last time, to try to re-use cache contents.  But
it doesn't seem to work well on any system I have seen.  The less a process
bounces, the more sluggish things like mouse movement seem to be.
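
The tradeoff can be illustrated with a toy model.  This is only a sketch
under made-up assumptions (each CPU is independently busy with some fixed
probability whenever our process becomes runnable) and is in no way the
Linux scheduler; the function name simulate and the policy names "sticky"
and "any_idle" are hypothetical.  "sticky" always waits for the CPU it last
used, while "any_idle" takes the first idle CPU that answers the doorbell.

```python
import random


def simulate(policy, n_cpus=4, steps=1000, busy_prob=0.3, seed=1):
    """Return (migrations, waits) for one process under a dispatch policy.

    policy is "sticky" (only ever run on the CPU used last time) or
    "any_idle" (run on the lowest-numbered idle CPU, wherever that is).
    """
    if policy not in ("sticky", "any_idle"):
        raise ValueError("unknown policy: %r" % (policy,))
    rng = random.Random(seed)
    last_cpu = 0
    migrations = 0  # times the process resumed on a different CPU (cold cache)
    waits = 0       # times the process could not be dispatched right away
    for _ in range(steps):
        # CPUs that happen to be idle when our process becomes runnable.
        idle = [c for c in range(n_cpus) if rng.random() >= busy_prob]
        if policy == "sticky":
            if last_cpu not in idle:
                waits += 1  # hold out for the warm cache
        else:
            if not idle:
                waits += 1  # nothing is free, so it has to wait anyway
            else:
                chosen = idle[0]
                if chosen != last_cpu:
                    migrations += 1
                last_cpu = chosen
    return migrations, waits


if __name__ == "__main__":
    for policy in ("sticky", "any_idle"):
        print(policy, simulate(policy))
```

With the same seed both policies see the same pattern of idle CPUs, so the
sticky one never migrates but waits at least as often, which is the
"hurts in one direction or the other" effect in miniature.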




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.