Computer Chess Club Archives



Subject: Re: Magic 200MHz

Author: Robert Hyatt

Date: 21:23:55 05/27/03



On May 27, 2003 at 22:56:54, Tom Kerrigan wrote:

>On May 27, 2003 at 16:39:08, Robert Hyatt wrote:
>
>>On May 27, 2003 at 13:23:24, Tom Kerrigan wrote:
>>
>>>On May 27, 2003 at 11:05:27, Robert Hyatt wrote:
>>>
>>>>>So how do you explain your statement that no OSs you've "tested" issue halts? I
>>>>>mean, Linux issues halts. Did you not "test" Linux?
>>>>
>>>>I'll leave that for _you_ to figure out.  You can find an explanation
>>>>in the "scheduler idle loop" code.
>>>
>>>Suck it up, Bob, and admit you were wrong. It's painfully obvious that you're
>>>not contradicting me, just handwaving and backpedaling enough to give yourself a
>>>heart attack. "I fiddle with the source code" and "I'll leave that for you to
>>>find out." Yeah, right, Bob. Do you think that if you continue with this asinine
>>>behavior, everybody will get confused and just assume you're right?
>>>
>>>From LinuxHQ, "The Linux Information Headquarters,"
>>>
>>>"Regardless, you should be aware that even if you don't enable any power
>>>management on your laptop, on the x86 architecture Linux will always issue the
>>>"hlt" instruction to your processor whenever nothing needs to be done. This
>>>results in lowering the power consumption of your CPU. Note that the system
>>>doesn't power down when it receives the hlt instruction; it just stops executing
>>>instructions until there is an interrupt."
>>>
>>>http://www.linuxhq.com/ldp/howto/mini/Battery-Powered/powermgm.html
>>>
>>>I can find a dozen other pages that say Linux issues halts at the drop of a hat.
>>>Just say the word and I'll make this even more embarrassing for you. (Although
>>>that's hard to imagine, isn't it?)
>>
>>Not embarrassing for me at all.  The point I referred to was that the
>>scheduler _first_ spins for a while, _then_ issues a halt.  The "for a
>
>Right, well, you said it didn't issue halts. You can write a dozen posts about
>how exactly halts are issued but that doesn't change the fact that you were
>wrong.

You stopped reading _too soon_.  The post right below this one was pretty
clear in saying that the kernel I am running is _not_ issuing halts at all,
and the comments in the sched.c code seem to suggest that polling rather
than halting is slightly faster on SMP boxes, as well as being a better
power saver on PIV systems (which I don't understand at all).

Just check out that post for more details and comments taken directly from
the Linux process scheduler.  To see if the halt was being done, I simply
stuffed a zero opcode in place of the "halt" to see if the kernel would
crash and burn.  It didn't.  I then stuffed a zero in the poll loop and
it instantly crashed.


>
>>>That's great, Bob. "It was not clear"?? How can you begin to justify telling me
>>>I'm wrong when "it was not clear" how many threads were running? (And, BTW, it
>>>became clear that multiple threads were running.)
>>
>>It never was 'clear' to me, which is _why_ I asked the question.
>>
>>However, back to your "idea" about shared write-combine buffers.  You claim
>
>Ha ha, sure, let's change the subject. No, wait, let's not. I want to see the
>post where it says single-thread RC5 slows down with HT enabled. You said that's
>what this thread is about, so where's the post, Bob? It should be easy to find,
>right? So where is it? Or are you going to blither on about how you weren't
>clear about something? (Well, that's pretty obvious.)

It's pretty clear what I was talking about.  My _first_ post specifically
asked the question "what was being measured?  A single thread, or two
threads?"  It was _very_ specific.




>
>>that if a single thread needs 5, then running two such threads will slow
>>such things down.  I'm not sure I buy that.  Because if you run _two_ threads
>>they need 10, and they get 8 combined.  The two threads combined should run
>>_faster_ combined, than a single thread.  I _still_ don't see any particularly
>>reasonable explanation for why two threads would run slower _combined_ than
>>one would run by itself.  Even if one thread needs all 8 WC buffers, running
>>two would mean each gets 1/2 of what it needs, and combined they should run at the
>>_same_ speed.
>
>If a program uses 4 WC buffers effectively and you limit it to 3, then your
>writes don't get combined effectively, so you issue more writes, which sucks up
>your cache bandwidth, which stalls your program. Really, what's not to get?

The _other_ thread.  The _other_ thread.  What is _it_ doing?  It is taking
advantage of those "stalled cycles"...  So it isn't a total loss...


>
>>>Execution units aren't considered part of the processor's resources by Intel,
>>>but if you want to talk about them, fine.
>>
>>Then this becomes a "semantics argument".  Because in the world of
>>computing, "execution units" are part of the CPU.  The Cray has used this
>>design since the 1970's.
>
>We're talking about Intel here. Do you think you can bring yourself to use
>Intel's semantics?

I believe I am.  I can't imagine anyone saying "execution units are not a
part of the processor's set of core resources."

Makes no sense to me.


>
>>>"The memory instruction queue and general instruction queues send uops to the
>>>five scheduler queues as fast as they can, alternating between uops for the two
>>>logical processors every clock cycle, as needed."
>>>
>>>So the execution units are split 50-50 temporally.
>>
>>except that _no_ application really drives a processor at 100% duty cycle.
>
>What does that have to do with anything? A program might not use all the reorder
>buffer entries. Does that mean you can't split the reorder buffer?

I have no idea what your question has to do with my previous statement.


>
>-Tom




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.