Computer Chess Club Archives


Subject: Re: Try RC5 w/ HT

Author: Robert Hyatt

Date: 14:50:00 05/24/03


On May 24, 2003 at 16:04:27, Tom Kerrigan wrote:

>On May 24, 2003 at 01:13:39, Robert Hyatt wrote:
>
>>On May 23, 2003 at 23:51:54, Tom Kerrigan wrote:
>>
>>>On May 23, 2003 at 22:58:51, Robert Hyatt wrote:
>>>
>>>>On May 22, 2003 at 23:29:25, Aaron Gordon wrote:
>>>>
>>>>>On May 22, 2003 at 22:24:29, Robert Hyatt wrote:
>>>>>
>>>>>>On May 22, 2003 at 13:43:55, Tom Kerrigan wrote:
>>>>>>
>>>>>>>On May 21, 2003 at 22:20:57, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On May 21, 2003 at 15:48:46, Tom Kerrigan wrote:
>>>>>>>>
>>>>>>>>>On May 21, 2003 at 13:46:26, Robert Hyatt wrote:
>>>>>>>>>
>>>>>>>>>>On May 20, 2003 at 13:52:01, Tom Kerrigan wrote:
>>>>>>>>>>
>>>>>>>>>>>On May 20, 2003 at 00:26:49, Robert Hyatt wrote:
>>>>>>>>>>>
>>>>>>>>>>>>Actually it _does_ surprise me.  The basic idea is that HT provides improved
>>>>>>>>>>>>resource utilization within the CPU.  IE would you prefer to have a dual 600mhz
>>>>>>>>>>>>or a single 1000mhz machine?  I'd generally prefer the dual 600, although for
>>>>>>>>>>>
>>>>>>>>>>>You're oversimplifying HT. When HT is running two threads, each thread only gets
>>>>>>>>>>>half of the core's resources. So instead of your 1GHz vs. dual 600MHz situation,
>>>>>>>>>>>what you have is more like a 1GHz Pentium 4 vs. a dual 1GHz Pentium. The dual
>>>>>>>>>>>will usually be faster, but in many cases it will be slower, sometimes by a wide
>>>>>>>>>>>margin.
>>>>>>>>>>
>>>>>>>>>>Not quite.  Otherwise how do you explain my NPS _increase_ when using a second
>>>>>>>>>>thread on a single physical cpu?
>>>>>>>>>>
>>>>>>>>>>The issue is that now things can be overlapped and more of the CPU core
>>>>>>>>>>gets utilized for a greater percent of the total run-time...
>>>>>>>>>>
>>>>>>>>>>If it were just 50-50 then there would be _zero_ improvement for perfect
>>>>>>>>>>algorithms, and a negative improvement for any algorithm with any overhead
>>>>>>>>>>whatsoever...
>>>>>>>>>>
>>>>>>>>>>And the 50-50 doesn't even hold true for all cases, as my test results have
>>>>>>>>>>shown, even though I have yet to find any reason for what is going on...
>>>>>>>>>
>>>>>>>>>Think a little bit before posting, Bob. I said that the chip's execution
>>>>>>>>>resources were evenly split, I didn't say that the chip's performance is evenly
>>>>>>>>>split. That's just stupid. You have to figure in how those execution resources
>>>>>>>>>are utilized and understand that adding more of these resources gives you
>>>>>>>>>diminishing returns.
>>>>>>>>>
>>>>>>>>>-Tom
>>>>>>>>
>>>>>>>>
>>>>>>>>You should follow your own advice.  If resources are split "50-50" then how
>>>>>>>>can _my_ program produce a 70-30 split on occasion?
>>>>>>>>
>>>>>>>>It simply is _not_ possible.
>>>>>>>>
>>>>>>>>There is more to this than a simple explanation offers...
>>>>>>>
>>>>>>>Now you're getting off onto another topic here.
>>>>>>>
>>>>>>
>>>>>>Read backward.  _I_ did not "change the topic".
>>>>>>
>>>>>>I said that I don't see how it is possible for HT to slow a program down.
>>>>>>
>>>>>>You said "50-50" resource allocation might be an explanation.
>>>>>>
>>>>>>I said "that doesn't seem plausible because I have at least one example of
>>>>>>two compute-bound threads that don't show a 50-50 balance on SMT."
>>>>>>
>>>>>>If Eugene is right, and I don't know as he was not sure and I haven't read
>>>>>>anything similar to what he mentioned, that _could_ explain it (ie if some
>>>>>>resources are split 50-50 between the two logical processors even if one
>>>>>>could use more than the other due to the particular application being run.
>>>>>>However that seems like a _bad_ design decision if it is true...)  However
>>>>>>there are probably other plausible explanations as well.  What is the _real_
>>>>>>explanation?  That will likely take some time to figure out.
>>>>>>
>>>>>>
>>>>>>>Originally you were saying that it's impossible for HT to slow a program down
>>>>>>>unless there was something wrong with the algorithm.
>>>>>>
>>>>>>And based on testing here, I pretty well stick with that.  I won't say there
>>>>>>is _no_ program that will run slower, but I haven't found one myself.  And
>>>>>>again, to be clear, we are talking about one program, one thread.  Run on
>>>>>>a machine with SMT on and SMT off.  I've run that test repeatedly and can't
>>>>>>find any penalty for one thread when turning SMT on.  And I do mean _no
>>>>>>penalty_ on anything I have tried.  Kernel builds.  Compiles.  Running
>>>>>>Crafty.  Running various compute-bound applications like NAMD, a big monte-carlo
>>>>>>simulation, etc...
>>>>>>
>>>>>>The idea really doesn't make sense, IMHO.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>Now you're back to complaining about your 70-30 split, which is only related to
>>>>>>>the original topic because they both involve ratios like "50-50" and "70-30."
>>>>>>
>>>>>>That 70-30 was used simply to suggest that 50-50 is _not_ a "golden rule" in
>>>>>>SMT resource allocation, apparently.  Nothing more.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>-Tom
>>>>>
>>>>>
>>>>>Hyatt, grab distributed.net's RC5-72 client, it supports multiple cpus and with
>>>>>every dual system I've seen it run on gets an exact 100% increase in
>>>>>nodes/second. Now, it only spawns 1 thread per processor & isn't memory
>>>>>intensive whatsoever (that I've seen, only CPU clock speed affects results). A
>>>>>P4 with HT gets HALF the speed of a P4 w/o HT in some of the results I've seen,
>>>>>if you get the time try to verify that for me. I would have figured this would
>>>>>have been one of the programs HT would shine at. Complete surprise to me...  If
>>>>>you could, grab the linux RC5-72 client at:
>>>>
>>>>What are they measuring?
>>>>
>>>>IE running two copies _should_ see each copy run about 1/2 as fast with SMT
>>>>on, since each copy is getting roughly 50% of available cpu core resources
>>>>when running the same instruction streams.
>>>
>>>Huh, looks like Hyatt _can_ learn something... Still wrong, but closer.
>>>
>>>-Tom
>>
>>
>>Now if _you_ would only do the same...
>
>Heh. What am I going to learn from you? All you do is write posts about how you
>don't know what's going on.
>
>-Tom


If you would listen, you would learn a _lot_ from most everyone here.

However, until you learn how much you _don't_ know, your education won't
proceed...



Last modified: Thu, 15 Apr 21 08:11:13 -0700