Author: Eugene Nalimov
Date: 09:52:17 04/12/03
On April 12, 2003 at 01:24:33, Robert Hyatt wrote:

>On April 11, 2003 at 23:47:23, Keith Evans wrote:
>
>>On April 11, 2003 at 23:26:35, Robert Hyatt wrote:
>>
>>>On April 11, 2003 at 16:53:59, Tom Kerrigan wrote:
>>>
>>>>On April 11, 2003 at 10:58:29, Robert Hyatt wrote:
>>>>
>>>>>I have explained "why not" before.
>>>>>
>>>>>My configuration is a dual 2.8. I can't remove a CPU because I don't have a
>>>>>terminator to stick in the socket. So I am stuck with two. I can enable or
>>>>>disable SMT when I boot the machine.
>>>>>
>>>>>now tell me how to run the test. Two copies might run on one physical cpu
>>>>>(using two logical cpus). Or they might run on two physical cpus. I have no
>>>>>control over that. And they will bounce around between processors as they run.
>>>>>
>>>>>Your turn. Tell me how to run a valid test and I'll let 'er rip.
>>>>
>>>>Actually a friend of mine has access to a P4/3.06 and I ran the test myself.
>>>>Took less than 5 minutes.
>>>>
>>>>I opened two instances of my program and had them search the same position
>>>>simultaneously and compared their NPS after ~10 seconds. I did this three times.
>>>>Task Manager showed that both logical processors were pegged. The NPS ratios
>>>>were:
>>>>
>>>>51%-49%
>>>>49%-51%
>>>>48%-52%
>>>>
>>>>It's pretty darn obvious that HT does not favor one logical processor more than
>>>>another. (Contrary to Hyatt and Vincent's assertions.)
>>>>
>>>>You should thank me, Bob. Your hands must be really tired from all that waving.
>>>>
>>>>-Tom
>>>
>>>
>>>First, I didn't say it did or it didn't. I said that tests suggest that there
>>>can be imbalances.
>>>
>>>Second, you found a result for _one_ test. What about one that does a lot of
>>>memory reads? Memory writes? Mixture?
>>>
>>>There are _lots_ of tests to do.
>>
>>Also I believe that he said that HT didn't improve his program's performance. So
>>you may see different behavior for Crafty which is helped by HT.
>
>I ran the test Tom suggested. Two different ways.
>
>First, four different threads. Results were a pretty even balance, varying
>from 45-55, to 49-51 depending on the run. Not bad.
>
>Then two programs using two threads each, using a patched kernel that let me
>lock a thread to a processor. Result was wildly varying. with a best of 60-40
>and a worst of 75-25. Why that is I have absolutely no idea. But even more
>interesting is that the two threads seem to "lose" time for reasons unknown at
>the moment. IE total time increases by about 30-50% which I don't understand at
>all. This still points to some odd cache issue I believe, and it seems to
>really influence SMT in a strange way...
>
>I'm trying to understand the two-thread results as they are probably related to
>the problem Vincent pointed out last week (NPS about 1.5X a single using a dual
>with no SMT at all.) Something is definitely fishy when I use threads. And
>the balance between CPUS is nowhere near 50-50 for some reason...

Have you tried my suggestion: aligning your main structure on a 128-byte
boundary? I.e., replace

tree[i] = (TREE*)malloc(sizeof(TREE));

with

tree[i] = (TREE*)((~(size_t)127) & (127+(size_t)malloc(sizeof(TREE)+127)));

Thanks,
Eugene
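For anyone who wants to try the alignment idea in a form that can also be freed later, here is a minimal sketch. It uses the same round-up-and-mask arithmetic as the one-liner above, but also stashes the pointer malloc returned just below the aligned block. The names aligned_malloc, aligned_free and TREE_ALIGN are made up for this illustration, and the TREE struct is only a stand-in for the engine's real structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TREE_ALIGN 128  /* assumed alignment, taken from the 128-byte suggestion */

/* Hypothetical stand-in for the engine's real TREE structure. */
typedef struct {
    char payload[1000];
} TREE;

/* Allocate `size` bytes starting at an address that is a multiple of `align`.
   `align` must be a power of two and at least sizeof(void *).
   The pointer returned by malloc is stored in the slot just below the
   aligned block so that aligned_free() can recover it. */
static void *aligned_malloc(size_t size, size_t align)
{
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

    /* Extra room: up to align-1 bytes of padding plus one pointer slot. */
    void *raw = malloc(size + align - 1 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Skip the pointer slot, then round up to the next multiple of align. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);

    ((void **)aligned)[-1] = raw;  /* remember what to hand back to free() */
    return (void *)aligned;
}

/* Release a block obtained from aligned_malloc(). NULL is accepted. */
static void aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

With these helpers the allocation would read tree[i] = (TREE*)aligned_malloc(sizeof(TREE), TREE_ALIGN); and the block can later go to aligned_free(tree[i]). The one-liner discards the pointer malloc returned, so its result cannot safely be passed to free(). Where available, posix_memalign or C11 aligned_alloc should do the same job without the hand-rolled bookkeeping.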
Last modified: Thu, 15 Apr 21 08:11:13 -0700