Computer Chess Club Archives


Subject: Re: Here are some actual numbers

Author: Robert Hyatt

Date: 23:06:48 04/12/03



On April 12, 2003 at 12:52:17, Eugene Nalimov wrote:

>On April 12, 2003 at 01:24:33, Robert Hyatt wrote:
>
>>On April 11, 2003 at 23:47:23, Keith Evans wrote:
>>
>>>On April 11, 2003 at 23:26:35, Robert Hyatt wrote:
>>>
>>>>On April 11, 2003 at 16:53:59, Tom Kerrigan wrote:
>>>>
>>>>>On April 11, 2003 at 10:58:29, Robert Hyatt wrote:
>>>>>
>>>>>>I have explained "why not" before.
>>>>>>
>>>>>>My configuration is a dual 2.8.  I can't remove a CPU because I don't have a
>>>>>>terminator to stick in the socket.  So I am stuck with two.  I can enable or
>>>>>>disable SMT when I boot the machine.
>>>>>>
>>>>>>now tell me how to run the test.  Two copies might run on one physical cpu
>>>>>>(using two logical cpus).  Or they might run on two physical cpus.  I have no
>>>>>>control over that.  And they will bounce around between processors as they
>>>>>>run.
>>>>>>
>>>>>>Your turn.  Tell me how to run a valid test and I'll let 'er rip.
>>>>>
>>>>>Actually a friend of mine has access to a P4/3.06 and I ran the test myself.
>>>>>Took less than 5 minutes.
>>>>>
>>>>>I opened two instances of my program and had them search the same position
>>>>>simultaneously and compared their NPS after ~10 seconds. I did this three times.
>>>>>Task Manager showed that both logical processors were pegged. The NPS ratios
>>>>>were:
>>>>>
>>>>>51%-49%
>>>>>49%-51%
>>>>>48%-52%
>>>>>
>>>>>It's pretty darn obvious that HT does not favor one logical processor more than
>>>>>another. (Contrary to Hyatt and Vincent's assertions.)
>>>>>
>>>>>You should thank me, Bob. Your hands must be really tired from all that waving.
>>>>>
>>>>>-Tom
>>>>
>>>>
>>>>First, I didn't say it did or it didn't.  I said that tests suggest that there
>>>>can be imbalances.
>>>>
>>>>Second, you found a result for _one_ test.  What about one that does a lot of
>>>>memory reads?  Memory writes?  Mixture?
>>>>
>>>>There are _lots_ of tests to do.
>>>
>>>Also I believe that he said that HT didn't improve his program's performance. So
>>>you may see different behavior for Crafty which is helped by HT.
>>
>>I ran the test Tom suggested.  Two different ways.
>>
>>First, four different threads.  Results were a pretty even balance, varying
>>from 45-55, to 49-51 depending on the run.  Not bad.
>>
>>Then two programs using two threads each, using a patched kernel that let me
>>lock a thread to a processor.  Results varied wildly, with a best of 60-40
>>and a worst of 75-25.  Why that is I have absolutely no idea.  But even more
>>interesting is that the two threads seem to "lose" time for reasons unknown at
>>the moment.  I.e., total time increases by about 30-50%, which I don't
>>understand at all.  This still points to some odd cache issue, I believe, and
>>it seems to really influence SMT in a strange way...
>>
>>I'm trying to understand the two-thread results as they are probably related to
>>the problem Vincent pointed out last week (NPS about 1.5X a single using a dual
>>with no SMT at all.)  Something is definitely fishy when I use threads.  And
>>the balance between CPUS is nowhere near 50-50 for some reason...
>
>Have you tried my suggestion: align your main structure on a 128-byte boundary?
>I.e. replace
>  tree[i] = (TREE*)malloc(sizeof(TREE));
>by
>  tree[i] = (TREE*)((~(size_t)127) & (127+(size_t)malloc(sizeof(TREE)+127)));
>
>
>Thanks,
>Eugene


Yes I tried that, and it didn't make a difference.
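
For reference, the one-line replacement quoted above throws away the pointer that malloc returned, so the block can never be passed to free(). That is probably harmless for a structure allocated once at startup. A minimal sketch of the same rounding trick that keeps the raw pointer follows. The names alloc_aligned and LINE_SIZE are made up for illustration and are not taken from Crafty.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LINE_SIZE 128  /* assumed line size, taken from the discussion above */

/* The mask trick below only works for power-of-two sizes. */
static_assert((LINE_SIZE & (LINE_SIZE - 1)) == 0,
              "LINE_SIZE must be a power of two");

/* Hypothetical helper: returns a LINE_SIZE-aligned block of at least `size`
   bytes.  The pointer malloc() actually returned is stored in *raw so the
   caller can free(*raw) later. */
static void *alloc_aligned(size_t size, void **raw)
{
    *raw = NULL;
    if (size > SIZE_MAX - (LINE_SIZE - 1))
        return NULL;                      /* size + padding would overflow */

    *raw = malloc(size + LINE_SIZE - 1);  /* over-allocate by up to 127 bytes */
    if (*raw == NULL)
        return NULL;

    /* Round the address up to the next multiple of LINE_SIZE. */
    uintptr_t addr = ((uintptr_t)*raw + LINE_SIZE - 1)
                     & ~(uintptr_t)(LINE_SIZE - 1);
    return (void *)addr;
}
```

A call would look like tree[i] = (TREE*)alloc_aligned(sizeof(TREE), &raw_tree[i]); where raw_tree is a hypothetical array that holds the pointers to free later.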

I really think this is an issue with invalidating cache lines that are 128
bytes long: one thread modifies something while the other has the same line
in its cache.  I even tried a one-thread version that put the tree structure
on a really ugly boundary, and based on NPS it didn't hurt at all.  But the
threaded version certainly takes a hit somewhere, and the longer line size is
the only thing that stands out.  Since it happens only between threads, and
not between separate processes, cache invalidation is the first thing that
comes to mind...
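
To make the suspected effect concrete, here is a minimal sketch of the layout difference. It only illustrates the hypothesis and is not a measurement or Crafty's actual data layout. The struct names, the same_line helper and LINE_SIZE are hypothetical. In the packed layout, both threads' counters fall in one 128-byte line, so a write by either thread would invalidate the other's cached copy. In the padded layout each counter has a line to itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 128  /* assumed line size, taken from the discussion above */

/* Hypothetical packed layout: the counters for thread 0 and thread 1 are
   adjacent, so they normally share a cache line. */
struct packed_stats {
    unsigned long nodes[2];
};

/* Hypothetical padded layout: one slot per thread, each filling a whole
   line, so adjacent slots never share one when the array is line-aligned. */
struct padded_slot {
    unsigned long nodes;
    char pad[LINE_SIZE - sizeof(unsigned long)];
};

static_assert(sizeof(struct padded_slot) == LINE_SIZE,
              "each slot must fill exactly one line");

/* Returns 1 if both addresses fall in the same LINE_SIZE-aligned line. */
static int same_line(const void *a, const void *b)
{
    return ((uintptr_t)a / LINE_SIZE) == ((uintptr_t)b / LINE_SIZE);
}
```

With a line-aligned base address, same_line reports 1 for the two packed counters and 0 for the two padded slots. Whether the padding changes NPS in the threaded case is exactly what would have to be tested.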




Last modified: Thu, 15 Apr 21 08:11:13 -0700
