Computer Chess Club Archives


Subject: Re: Memory benchmark comparison DDR333 vs RDRAM PC1066 !

Author: Matt Taylor

Date: 12:47:38 12/05/02


On December 05, 2002 at 10:30:08, Robert Hyatt wrote:

>On December 05, 2002 at 01:58:07, Matt Taylor wrote:
>
>>On December 04, 2002 at 23:53:50, Robert Hyatt wrote:
>>
>>>On December 04, 2002 at 12:54:05, Matt Taylor wrote:
>>>
>>>>On December 03, 2002 at 23:10:31, Robert Hyatt wrote:
>>>>
>>>>>On December 02, 2002 at 23:59:51, Matt Taylor wrote:
>>>>>
>>>>>>>How can you use the hlt instruction? It's privileged, and you're in ring 3.
>>>>>>
>>>>>>Never mind. I forgot that the P4 introduced a pause instruction to reduce the
>>>>>>rate that spin loops execute at, creating less contention on the bus and
>>>>>>allowing the processor to run cooler. The pause instruction -ISN'T- privileged.
>>>>>
>>>>>
>>>>>The main issue is not bus contention.  Everybody uses a "shadow lock" approach
>>>>>so that we spin on a cache value rather than repeatedly beating on the xchg
>>>>>instruction and frying the bus.  But you don't want one thread spinning like
>>>>>mad doing no useful work, while the other thread actually holds the lock but
>>>>>is currently waiting for cpu cycles because the SMT scheduler has chosen to
>>>>>execute micro-ops from the _spinning_ thread.
>>>>>
>>>>>That is why crafty does poorly on a non-dedicated SMP machine, and
>>>>>hyper-threading simply produces the same problem.  The pause will solve it
>>>>>for hyper-threading, but doesn't help a bit on the non-dedicated machine
>>>>>case.  There, spinlocks are worse than mutexes that physically block the
>>>>>thread, although I am playing with a sched_yield() system call that does
>>>>>the same thing to the linux kernel as the pause does to the SMT core.
>>>>
>>>>Oh, this is true. Cache snooping means you can spin on the cache value.
>>>>
>>>>Actually, Intel literature states that the pause instruction serves to prevent
>>>>the memory order violation condition that occurs when the spin loop exits. (I am
>>>>not sure WHY they incur a memory order violation when the spin loop exits.) It
>>>>also allows the processor to spin more efficiently. It can introduce a delay
>>>>rather than polling full-throttle. It may help Hyperthreading, but I don't think
>>>>this was their (sole) intention.
>>>>
>>>>For reference, they also state that the pause instruction may delay 0 cycles.
>>>>The memory order violation hint is its primary purpose.
>>>>
>>>>-Matt
>>>
>>>
>>>I didn't see the memory order issue.  The white-paper from Intel I read
>>>simply mentioned the spinlock problem for those that are doing this...
>>
>>IA-32 Software Developer's Manual Vol. 2: Instruction Set Reference
>>Order Number 245271-006
>>
>>Pause instruction:
>>
>>"Improves the performance of spin-wait loops. When executing a "spin-wait loop,"
>>a Pentium 4 or Intel Xeon processor suffers a severe performance penalty when exiting the
>>loop because it detects a possible memory order violation..."
>>
>>It says nothing about an effect on hyperthreading, but I would presume that the
>>CPU is intelligent enough to do what you say.
>>
>>-Matt
>
>
>Eugene sent me a pointer to an Intel white-paper on hyperthreading and
>spinlocks.  The Linux kernel guys also saw it when they received
>preproduction samples of SMT-enabled processors...

Sigh. Typical of Intel. You know, the P4 manual still doesn't document opcodes
that have been present since the 8086. Ironically, AMD has documented them for
years, and some appear in the Opteron manuals.

That is useful to know, though. I'll have to fix up my spinlocks with the pause
instruction. It's particularly convenient since pause is encoded as rep nop
(F3 90), a 2-byte sequence that executes as a plain nop on older processors
that predate it.
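
For reference, here is a minimal sketch of the "shadow lock" spin loop discussed above with a pause added to the read-only spin. It assumes C11 atomics and GCC/Clang inline assembly on x86. The names SpinLock, spin_lock, spin_trylock, spin_unlock and cpu_relax are made up for illustration; this is not Crafty's code, nor anything taken from the Intel white-paper.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Spin-wait hint. On x86 with GCC/Clang this emits pause (F3 90);
 * elsewhere it compiles to nothing. Hypothetical helper name. */
static inline void cpu_relax(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("pause" ::: "memory");
#endif
}

/* 0 = free, 1 = held */
typedef struct { atomic_int locked; } SpinLock;

static inline void spin_init(SpinLock *l) {
    atomic_init(&l->locked, 0);
}

/* One attempt at the atomic exchange (the bus-locking xchg on x86).
 * Returns true if the lock was free and is now ours. */
static inline bool spin_trylock(SpinLock *l) {
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

static inline void spin_lock(SpinLock *l) {
    for (;;) {
        if (spin_trylock(l))
            return;
        /* "Shadow" phase: plain loads that hit the local cache line,
         * so the bus stays quiet until the holder writes 0. */
        while (atomic_load_explicit(&l->locked,
                                    memory_order_relaxed) != 0)
            cpu_relax();
    }
}

static inline void spin_unlock(SpinLock *l) {
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The exchange is only retried after the read-only loop has seen the lock go free, which is the point of spinning on the cached value. For the non-dedicated machine case mentioned in the quoted text, one could call sched_yield() from <sched.h> in place of (or after some number of) cpu_relax() calls; whether that helps depends on the scheduler.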

-Matt


Last modified: Thu, 15 Apr 21 08:11:13 -0700