Author: Robert Hyatt
Date: 15:20:51 12/05/02
On December 05, 2002 at 15:47:38, Matt Taylor wrote:

>On December 05, 2002 at 10:30:08, Robert Hyatt wrote:
>
>>On December 05, 2002 at 01:58:07, Matt Taylor wrote:
>>
>>>On December 04, 2002 at 23:53:50, Robert Hyatt wrote:
>>>
>>>>On December 04, 2002 at 12:54:05, Matt Taylor wrote:
>>>>
>>>>>On December 03, 2002 at 23:10:31, Robert Hyatt wrote:
>>>>>
>>>>>>On December 02, 2002 at 23:59:51, Matt Taylor wrote:
>>>>>>
>>>>>>>>How can you use the hlt instruction? It's privileged, and you're in ring 3.
>>>>>>>
>>>>>>>Nevermind. I forgot that the P4 introduced a pause instruction to reduce the
>>>>>>>rate that spin loops execute at, creating less contention on the bus and
>>>>>>>allowing the processor to run cooler. The pause instruction -ISN'T- privileged.
>>>>>>
>>>>>>
>>>>>>The main issue is not bus contention. Everybody uses a "shadow lock" approach
>>>>>>so that we spin on a cache value rather than repeatedly beating on the xchg
>>>>>>instruction and frying the bus. But you don't want one thread spinning like
>>>>>>mad doing no useful work, while the other thread actually holds the lock but
>>>>>>is currently waiting for cpu cycles because the SMT scheduler has chosen to
>>>>>>execute micro-ops from the _spinning_ thread.
>>>>>>
>>>>>>That is why crafty does poorly on a non-dedicated SMP machine, and
>>>>>>hyper-threading simply produces the same problem. The pause will solve it
>>>>>>for hyper-threading, but doesn't help a bit on the non-dedicated machine
>>>>>>case. There spinlocks are worse than mutexes that physically block the
>>>>>>thread, although I am playing with a sched_yield() system call that does
>>>>>>the same thing to the linux kernel as the pause does to the SMT core.
>>>>>
>>>>>Oh, this is true. Cache snooping means you can spin on the cache value.
>>>>>
>>>>>Actually, Intel literature states that the pause instruction serves to prevent
>>>>>the memory order violation condition that occurs when the spin loop exits. (I am
>>>>>not sure WHY they incur a memory order violation when the spin loop exits.) It
>>>>>also allows the processor to spin more efficiently. It can introduce a delay
>>>>>rather than polling full-throttle. It may help Hyperthreading, but I don't think
>>>>>this was their (sole) intention.
>>>>>
>>>>>For reference, they also state that the pause instruction may delay 0 cycles.
>>>>>The memory order violation hint is their primary purpose.
>>>>>
>>>>>-Matt
>>>>
>>>>
>>>>I didn't see the memory order issue. The white-paper from Intel I read
>>>>simply mentioned the spinlock problem for those that are doing this...
>>>
>>>IA-32 Software Developer's Manual Vol. 2: Instruction Set Reference
>>>Order Number 245271-006
>>>
>>>Pause instruction:
>>>
>>>"Improves the performance of spin-wait loops. When executing a "spin-wait loop,"
>>>a Pentium 4 or Intel Xeon suffer a severe performance penalty when exiting the
>>>loop because it detects a possible memory order violation..."
>>>
>>>It says nothing about an effect on hyperthreading, but I would presume that the
>>>CPU is intelligent enough to do what you say.
>>>
>>>-Matt
>>
>>
>>Eugene sent me a pointer to an intel white-paper on hyperthreading and
>>spinlocks. The linux kernel guys also saw it when they received
>>preproduction samples of SMT-enabled processors...
>
>Sigh. Typical of Intel. You know, the P4 manual still doesn't document opcodes
>that have been present since the 8086. Ironically, AMD has documented them for
>years, and some appear in the Opteron manuals.
>
>That is useful to know, though. I'll have to fix up my spinlocks with the pause
>instruction. It's particularly useful since pause is a 2-byte alias for nop on
>all other processors.
>
>-Matt

The nop was a critical point; otherwise it would be non-portable to non-SMT processors.
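
For anyone following along, here is a rough sketch of the "shadow lock" idea with a pause in the spin loop. It is not Crafty's actual lock code. The names shadow_lock_t, cpu_relax, and run_counter_demo are made up for illustration, and it assumes a C11 compiler with GCC-style inline assembly plus POSIX threads. The atomic exchange is only attempted after a plain read has seen the lock free, so the waiting thread spins on its cached copy, and the pause sits inside that read-only loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical helper: spin-wait hint on x86, nothing elsewhere.
   Assumes GCC/Clang inline assembly syntax. */
#if defined(__i386__) || defined(__x86_64__)
#define cpu_relax() __asm__ __volatile__("pause" ::: "memory")
#else
#define cpu_relax() ((void)0)
#endif

/* Illustrative lock type: 0 means free, 1 means held. */
typedef struct { atomic_int locked; } shadow_lock_t;

static void shadow_lock_acquire(shadow_lock_t *l)
{
    for (;;) {
        /* The only bus-locking operation: an atomic exchange. */
        if (atomic_exchange_explicit(&l->locked, 1,
                                     memory_order_acquire) == 0)
            return;
        /* "Shadow" spin: read-only loop on the cached value, with a
           pause so the other logical CPU can make progress. */
        while (atomic_load_explicit(&l->locked,
                                    memory_order_relaxed) != 0)
            cpu_relax();
    }
}

static void shadow_lock_release(shadow_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Small demo: several threads bump one counter under the lock. */
static shadow_lock_t demo_lock;
static long demo_counter;

static void *demo_worker(void *p)
{
    long iters = *(long *)p;
    for (long i = 0; i < iters; i++) {
        shadow_lock_acquire(&demo_lock);
        demo_counter++;               /* protected by demo_lock */
        shadow_lock_release(&demo_lock);
    }
    return NULL;
}

/* Runs up to 16 threads and returns the final counter value. */
long run_counter_demo(int nthreads, long iters)
{
    pthread_t tid[16];
    if (nthreads > 16)
        nthreads = 16;
    atomic_store(&demo_lock.locked, 0);
    demo_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, demo_worker, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return demo_counter;
}
```

Calling run_counter_demo(4, 20000) should return 80000 if the lock is doing its job. On a non-dedicated machine the pause does not help, as noted above; there one could experiment with calling sched_yield() (from <sched.h>) in place of cpu_relax(), or simply use a blocking mutex.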
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.