Author: Robert Hyatt
Date: 17:00:11 12/10/02
On December 10, 2002 at 16:43:29, Matt Taylor wrote:

>On December 10, 2002 at 16:35:11, Robert Hyatt wrote:
>
>>On December 10, 2002 at 14:31:51, Matt Taylor wrote:
>>
>>>On December 10, 2002 at 13:18:45, Robert Hyatt wrote:
>>>
>>>>On December 10, 2002 at 12:31:46, Matt Taylor wrote:
>>>>
>>>>>On December 10, 2002 at 12:21:33, Robert Hyatt wrote:
>>>>>
>>>>>>On December 10, 2002 at 11:34:45, Jeremiah Penery wrote:
>>>>>>
>>>>>>>On December 10, 2002 at 10:57:40, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On December 10, 2002 at 09:08:10, Vincent Diepeveen wrote:
>>>>>>>>
>>>>>>>>>Matt i don't know it for crafty or other crap products. Crafty as we
>>>>>>>>>see in test needs less nodes when running MT=2,
>>>>>>>>
>>>>>>>>I realize this is hard for you to do, but is it _possible_ that you can stick
>>>>>>>>to _real_ data when you post? The above is _absolute_ crap. Crafty does
>>>>>>>>_not_ "need less nodes when MT=2". In some positions, yes, but in
>>>>>>>>more positions it needs _more_. And for the average case it needs _more_.
>>>>>>>>
>>>>>>>>I don't know why you continue to post something that any person here can
>>>>>>>>refute simply by running the code. I've done it for you many times. The
>>>>>>>>above is false. Please find something _else_ to wave your hands about.
>>>>>>>
>>>>>>>It came from the original data in this thread:
>>>>>>
>>>>>>So? That is over 6 positions. Using that to prove that a program searches
>>>>>>"fewer nodes with mt=2" is total nonsense, as is the claim that a program
>>>>>>+will+ search fewer nodes overall using two threads. It simply doesn't
>>>>>>happen. And it falls in the same class as the perpetual-motion machine...
>>>>>>It doesn't work...
>>>>>
>>>>>I like Cold Fusion a little better.
>>>>
>>>>I'm not going that far. There is always a remote possibility that something
>>>>like that might be possible given the right materials and conditions.
>>>>Perpetual motion is another thing entirely, as is a speedup > 2.0 with two
>>>>processors. :)
>>>
>>>Yeah. I like the Cold Fusion example because the data does not justify the
>>>claim. But yeah, it is difficult to see how a second processor would possibly
>>>create a speed-up of more than a factor of 2. Obviously if that (legitimately)
>>>happens, more than just the number of CPUs has changed.
>>>
>>>>>>>Crafty v18.15
>>>>>>>White(1): bench
>>>>>>>Running benchmark. . .
>>>>>>>......
>>>>>>>Total nodes: 97487547
>>>>>>>Raw nodes per second: 1160566
>>>>>>>Total elapsed time: 84
>>>>>>>SMP time-to-ply measurement: 7.619048
>>>>>>>White(1):
>>>>>>>-------------------------------------
>>>>>>>Crafty v18.15 (2 cpus)
>>>>>>>White(1): bench
>>>>>>>Running benchmark. . .
>>>>>>>......
>>>>>>>Total nodes: 94658095
>>>>>>>Raw nodes per second: 1314695
>>>>>>>Total elapsed time: 72
>>>>>>>SMP time-to-ply measurement: 8.888889
>>>>>>>
>>>>>>>
>>>>>>>>What is "a buggy crafty?" And what is the 13-16%? I posted _real_ data. You
>>>>>>>>post fantasy without even having access to a box? And that is fact???
>>>>>>>
>>>>>>>You can see also that the NPS speedup in that above data is 13%.
>>>>>>
>>>>>>For _one_ test... With a version of the program that has a _known_ problem
>>>>>>with SMT.
>>>>>
>>>>>You mean the pause issue, or is there more than just that?
>>>>>
>>>>>-Matt
>>>>
>>>>Yes.... but not just in the Lock() code... there is a critical spin-wait that
>>>>needs a pause, otherwise one thread will be running in a spin-wait while the
>>>>other thread is waiting to get scheduled, and _it_ is the one that will give
>>>>the "spinner" something to work on. :)
>>>
>>>Ah. I'm interested in seeing the results, but I'm not expecting a huge gain from
>>>using pause. If one thread is beating on the lock, it leaves the majority of the
>>>execution resources and bandwidth for the other logical thread.
>>>I don't think that reducing the polling rate of the L1 cache will affect
>>>results much.
>>>
>>>I guess the only thing we can say right now is, "We will see!"
>>>
>>>-Matt
>>
>>Think about it for a minute. You have two processes to schedule. One is doing
>>something useful, the other is busy spinning. So every chance the "spinner"
>>gets, it executes full-speed ahead. And while it is executing, the _other_
>>thread is sitting. The CPU has a 50% chance of choosing the _wrong_ thread
>>when one is computing doing useful work and the other is spinning doing
>>nothing but waiting on something to do...
>>
>>and that is what pause helps with, the "spinner" makes one pass thru the spin
>>loop and then says "run the other thread now"...
>
>That's true for a scheduler on a single processor, but that's not how
>Hyperthreading works as I understand it. Then again, it is possible that the
>docs I read are wrong. (The last thing I read about HT was over 2 years ago.)

That is the way I have seen it described in various Intel white papers. In
particular, they refer to the CPU's "resource scheduler" and compare it to a
multiprogramming operating system that is running two processes concurrently.

>They said that HT allows -concurrent- scheduling of threads, but the threads
>obviously cannot make use of the same execution resources. If this is correct,
>one thread would be spinning (consuming bandwidth to the L1 cache) while the
>other thread was doing real work.

Again, think about what you just said; what you describe can't happen. If one
thread is smoking the L1/L2 cache, then it is not waiting for _anything_, and
once it is scheduled it will execute until the CPU decides to flip to the other
thread. Or until that thread does a pause. Whichever comes first.

>For now I'm going to stick to what I have read. I'll poke around sometime later
>this week and see if I can find any updated material on the inner workings of
>HT.
>
>-Matt
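[Editor's note: the spin-wait-plus-pause idea debated above can be sketched in C. This is a hedged illustration, not Crafty's actual Lock() code: the function names, the use of C11 `atomic_flag`, and the inline-assembly pause are assumptions about the technique being described, namely a test-and-set spin lock whose busy loop executes the x86 `pause` instruction so that on a Hyper-Threaded (SMT) core the spinning logical thread hints the CPU to hand execution resources to its sibling.]

```c
/* Hypothetical sketch of a spin lock with a pause hint.
 * NOT Crafty's real Lock(); names and structure are illustrative.
 * Assumes C11 atomics and a GCC-compatible compiler. */
#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;

/* On x86, "pause" tells the CPU this is a spin-wait loop, so an SMT
 * core can favor the sibling logical thread instead of letting the
 * spinner burn shared execution resources at full speed. */
static inline void cpu_pause(void) {
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("pause");
#endif
}

void Lock(void) {
    /* Spin until the flag was previously clear (i.e., we acquired it),
     * pausing once per iteration of the spin loop. */
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        cpu_pause();
}

void Unlock(void) {
    atomic_flag_clear_explicit(&lock, memory_order_release);
}
```

Without the pause, the spinning logical thread competes at full speed for the shared pipeline, which matches Hyatt's point above: the hint is what makes the "spinner" step aside for the thread that will eventually give it work.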