Computer Chess Club Archives


Subject: Re: Here are some actual numbers

Author: Robert Hyatt

Date: 21:01:34 04/15/03


On April 15, 2003 at 14:56:38, Tom Kerrigan wrote:

>On April 14, 2003 at 23:41:10, Robert Hyatt wrote:
>
>>On April 14, 2003 at 18:19:28, Ricardo Gibert wrote:
>>
>>>On April 14, 2003 at 17:54:14, Robert Hyatt wrote:
>>>
>>>>On April 14, 2003 at 15:50:22, Tom Kerrigan wrote:
>>>>
>>>>>On April 13, 2003 at 11:21:51, Robert Hyatt wrote:
>>>>>
>>>>>>On April 13, 2003 at 02:37:57, Tom Kerrigan wrote:
>>>>>>
>>>>>>>On April 13, 2003 at 01:04:52, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>It _is_ pinned on SMT.  The two logical processors are producing wildly
>>>>>>>>imbalanced results when using threads, vs using two separate processes.  It
>>>>>>>>would appear to be cache-related...
>>>>>>>
>>>>>>>This is some sort of joke, right? You and Vincent see the same behavior, you
>>>>>>>have SMT and Vincent doesn't, and somehow the problem is with SMT?
>>>>>>
>>>>>>The _variability_ is with SMT.  What are you talking about?  I reported _two_
>>>>>>issues.
>>>>>>
>>>>>>1.  My dual xeon runs two copies of crafty about 2x as fast as if they were
>>>>>>run one after the other.  So does my quad 700.
>>>>>>
>>>>>>2.  My dual xeon runs one copy, two threads, at about 1.5X the speed that it
>>>>>>should.
>>>>>>
>>>>>>That is a problem.
>>>>>>
>>>>>>The second issue is that my dual xeon does _not_ run threaded crafty in a
>>>>>>balanced way on two logical processors.   For two independent copies, it
>>>>>>varies from 50-50 to 45-55.  Not unreasonable.  But for the single copy with
>>>>>>two threads, it varies all the way to 70-30.  _That_ is an SMT issue.  Probably, as
>>>>>>I mentioned, caused by some unknown L2 cache issue.  But it _is_ a problem
>>>>>>with SMT if you want to assume that normally it is about 50-50 roughly, for
>>>>>>_regular_ applications.
>>>>>>
>>>>>>Shared memory, locks, etc. are causing something strange to happen.
>>>>>
>>>>>It looks like you're having enough problems and unexplained behavior already
>>>>>that it's hard to trust any sort of numbers you post. But still, if the widest
>>>>>disparity you measured was 70-30, that seems like enough to dispel your notion
>>>>>that one thread always gets priority over the other.
>>>>>
>>>>>-Tom
>>>>
>>>>
>>>>How?
>>>>
>>>>70-30 is > 2:1.
>>>>
>>>>Something is going on.
>>>
>>>
>>>If the worst you could do by flipping a coin 1 million times is to get heads 70%
>>>of the time, one should conclude the coin is unbiased? I don't think so. You're
>>>right to think 70-30 is a significant result. There is some asymmetry (a bug?)
>>>going on where none is expected.
>>
>>My original idea was that somewhere along the way, you _must_ make a decision
>>about which of two things to do next.  Flipping a bit complicates the process
>>if it has to be done many times.  The old Cray did it by using the processor
>>ID to break ties.  It is possible that somewhere in the PIV core, there is
>>a tie-break that is not 50-50.  It is also possible that the  results I am
>>getting are somehow wrong...
>
>"Tie breaking" is not the issue. If you read the thing Anthony posted, you'd
>know that all the P4's resources are evenly divided: instruction window, reorder
>registers, reorder buffer, load/store queues, everything. How can you have
>unbalanced execution when each thread gets half of everything? I don't think you
>could if you wanted.
>
>When I suggested that you and Vincent had the same problem, with unbalanced
>processing, you bitched that you had really reported two problems. Well, that's
>not so clear to me. If your dual proc machine is searching 1.5x the NPS that it
>should, is it unreasonable to think that maybe one of the processors is somehow
>idle (spinning) half the time? Because if that's the case, running both threads
>on a HT processor could easily result in a 66-33 disparity in NPS per thread,
>which is pretty damn close to what you're seeing.
>
>Occam's razor, Bob.
>
>-Tom


But the principle doesn't apply here.  Why?  Because I _know_ that one thread
is not "spinning" or "waiting" whatsoever.  I _carefully_ account for every
spin while waiting on work.  It averages about 5% out of 400% on a 3-minute
search.

There is more going on than meets the eye, from several perspectives.
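
The arithmetic behind the two positions can be sketched as follows. This is only an illustration, not anything taken from Crafty: the function names are hypothetical, it assumes nodes searched are proportional to the time a thread is actually busy, and it assumes "5% out of 400%" means 5 points of spin out of a total capacity of 400 (four processors at 100% each).

```python
def nps_shares(busy_fractions):
    """Each thread's share of total nodes, assuming nodes searched
    are proportional to the fraction of time the thread is busy."""
    total = sum(busy_fractions)
    return [b / total for b in busy_fractions]


def spin_fraction(spin_percent, capacity_percent):
    """Fraction of total machine capacity spent spinning."""
    return spin_percent / capacity_percent


# Tom's hypothesis: one thread busy all the time, the other idle half the time.
busy = [1.0, 0.5]
speedup = sum(busy)        # 1.5x a single thread
shares = nps_shares(busy)  # about two thirds vs. one third

# The reported worst case of 70-30, expressed as a ratio between threads.
observed_ratio = 70 / 30   # a little over 2:1

# Bob's measurement (assumed interpretation): 5 points of spin out of 400.
measured_spin = spin_fraction(5.0, 400.0)

print(f"hypothesis: speedup {speedup:.2f}x, "
      f"split {shares[0]:.1%} / {shares[1]:.1%}")
print(f"observed worst-case ratio: {observed_ratio:.2f}:1")
print(f"measured spin: {measured_spin:.2%} of total capacity")
```

Under these assumptions, the idle-thread hypothesis needs one thread to be idle about 50% of the time, while the measured spin time is a small fraction of that.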



Last modified: Thu, 15 Apr 21 08:11:13 -0700