Computer Chess Club Archives

Subject: Re: But, Re: Questions re P4 3.03 with HT ??

Author: Matt Taylor

Date: 20:11:36 12/10/02

On December 10, 2002 at 22:54:45, Robert Hyatt wrote:

>On December 10, 2002 at 21:19:18, Matt Taylor wrote:
>
>>On December 10, 2002 at 21:13:28, Robert Hyatt wrote:
>>
>>>On December 10, 2002 at 20:33:34, Jeremiah Penery wrote:
>>>
>>>>On December 10, 2002 at 20:18:16, Robert Hyatt wrote:
>>>>
>>>>>On December 10, 2002 at 20:12:06, Jeremiah Penery wrote:
>>>>>
>>>>>>On December 10, 2002 at 20:00:11, Robert Hyatt wrote:
>>>>>>
>>>>>>>On December 10, 2002 at 16:43:29, Matt Taylor wrote:
>>>>>>>
>>>>>>>>They said that HT allows -concurrent- scheduling of threads, but the threads
>>>>>>>>obviously cannot make use of the same execution resources. If this is correct,
>>>>>>>>one thread would be spinning (consuming bandwidth to the L1 cache) while the
>>>>>>>>other thread was doing real work.
>>>>>>>
>>>>>>>Again, think about what you just said, which is impossible to happen.  If one
>>>>>>>thread is smoking the L1/L2 cache, then it is not waiting for _anything_ and
>>>>>>>once it is scheduled it will execute until the cpu decides to flip to the other
>>>>>>>thread.  Or until that thread does a pause.  Whichever comes first.
>>>>>>
>>>>>>The point is that the spinning thread blocks no execution units.  The processor
>>>>>>can spin the idle thread all it wants, why should that stop it from scheduling
>>>>>>the second thread, which _will_ use the execution units, to run at the same
>>>>>>time?
>>>>>
>>>>>
>>>>>I don't follow.  The "spinning thread" completely fills the integer pipe...
>>>>
>>>>Processors have more than one integer pipe, and I'm sure that a spinning thread
>>>>doesn't fill more than one.  In a P4, which has dual-pumped ALUs, a spinning
>>>>thread wouldn't even block a single pipe.  That is, if the scheduler were smart
>>>>enough to schedule other thread(s) to fill that unit.
>>>
>>>Somehow we are not on the same page. A single tight compute-bound loop can
>>>_completely_ fill one pipe by itself with _no_ problems.  The micro-ops
>>>will simply stuff that pipe totally as every branch will be predicted
>>>correctly...
>>>
>>>And if that thread is sucking up the cpu, the _other_ thread is going to
>>>be hindered since it can probably use _everything_ in the CPU when it is
>>>running...
>>>
>>>
>>>>
>>>>>The cpu doesn't execute two threads at a time, it flips and flops back and
>>>>>forth between them.  The spinning thread will _never_ give up control and has
>>>>>to be either preempted by the cpu, or else it has to do a pause, as explained
>>>>>in the intel white-paper on the subject...
>>>>>
>>>>>Otherwise the pause would _not_ be needed...
>>>>
>>>>What's the point of hyper-threading if two threads don't run at the same time?
>>>>Yeah, sure, you can execute while one thread waits on memory or something, but
>>>>it's certainly not the most efficient use.  All the documentation I've seen
>>>>suggests that if one thread is using, say, half the integer pipes, that another
>>>>thread can be scheduled concurrently to use the other half of the pipes.
>>>
>>>
>>>
>>>What is the point in an operating system for executing two processes at the
>>>same time?  Because one blocks and the other uses those unused cycles.  That
>>>is the _only_ point of running more than one process at a time.  That is the
>>>only point for hyper-threading also.  It has just moved a bit of the process
>>>scheduling down into the CPU.  The OS feeds the CPU two candidate processes
>>>to "interleave" and the CPU does that at the hardware level, more efficiently.
>>>
>>>As far as sharing pipes, that can happen.  But if one thread is burning one
>>>pipe up doing useless work, that is lost cycles that the other thread can't
>>>get to.  Which is _the_ point for the "pause" instruction...
>>
>>The integer pipe feeds into 5 integer execution units which can be accessed
>>concurrently each cycle. However, a spin-wait loop will only be able to use 1
>>unit because of register dependencies.
>
>
>Not necessarily.  Look at "ThreadWait()" in Crafty.  It is a more complicated
>"spin wait" that is testing several things in the same loop...  but
>regardless of whether it keeps one execution unit busy or two or three, it does
>_not_ matter.  That is one execution unit that the other thread can't get
>to, which is the point for the "pause" instruction.
>
>Otherwise the "pause" is pointless.  Why do you think they implemented that?
>And why do you think they wrote a 7-8 page paper describing how to do
>spinlocks and spinwaits using the pause instruction?

Here are the first two paragraphs on the pause instruction from the P4 manual. I
did not continue past that because the manual digresses from function and talks
about compatibility, exceptions, pseudo-code, etc.

IA-32 Intel Architecture Software Developer's Manual Vol. 2: Instruction Set
Reference
Order 245471-006

Page 586/966: Pause -- Spin Loop Hint

Improves the performance of spin-wait loops. When executing a "spin-wait loop,"
a Pentium 4 or Intel Xeon processor suffers a severe performance penalty when
exiting the loop because it detects a possible memory order violation. The pause
instruction provides a hint to the processor that the code sequence is a
spin-wait loop. The processor uses this hint to avoid the memory order violation
in most situations, which greatly improves processor performance. For this
reason, it is recommended that a pause instruction be placed in all spin-wait
loops.

An additional function of the pause instruction is to reduce the power consumed
by a Pentium 4 processor while executing a spin loop. The Pentium 4 processor
can execute a spin-wait loop extremely quickly, causing the processor to consume
a lot of power while it waits for the resource it is spinning on to become
available. Inserting a pause instruction in a spin-wait loop greatly reduces the
processor's power consumption...
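
For reference, here is roughly what a spin-wait loop with the pause hint looks like in C. This is only a minimal sketch, not the code from Crafty's ThreadWait() or from the Intel paper. The names cpu_relax and spin_wait are made up for illustration, and I am assuming a compiler that provides C11 atomics and the _mm_pause() intrinsic on x86. On other targets the hint is assumed to compile to nothing.

```c
#include <assert.h>
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE spin-loop hint */
#else
#define cpu_relax() ((void)0)     /* assumption: no hint on non-x86 targets */
#endif

/* Hypothetical helper: spin until *flag becomes nonzero.
 * Returns how many loop iterations were spent waiting. */
static unsigned long spin_wait(atomic_int *flag)
{
    unsigned long spins = 0;

    assert(flag != NULL);

    /* Re-read the flag each iteration; the acquire load makes the
     * writer's earlier stores visible once the flag is seen set. */
    while (atomic_load_explicit(flag, memory_order_acquire) == 0) {
        cpu_relax();              /* tell the CPU this is a spin-wait */
        ++spins;
    }
    return spins;
}
```

Another thread would release the waiter with something like atomic_store_explicit(&flag, 1, memory_order_release). If the flag is already set when spin_wait is called, the loop body never runs and no pause is issued.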

