Author: Robert Hyatt
Date: 18:40:20 11/08/03
On November 08, 2003 at 14:55:15, Robert Hyatt wrote:

>On November 05, 2003 at 18:57:08, Eugene Nalimov wrote:
>
>>On November 05, 2003 at 18:17:03, Robert Hyatt wrote:
>>
>>>On November 05, 2003 at 16:41:51, Eugene Nalimov wrote:
>>>
>>>>On November 05, 2003 at 09:54:13, Robert Hyatt wrote:
>>>>
>>>>>On November 05, 2003 at 05:22:16, Ed Schröder wrote:
>>>>>
>>>>>>If you the choice between:
>>>>>>
>>>>>>1) AMD Opteron 244, 1.8 Ghz, S-940 Box
>>>>>>
>>>>>>and:
>>>>>>
>>>>>>2) AMD MP 2600+, 266Mhz
>>>>>>
>>>>>>then what would be the best choice regarding speed.
>>>>>>
>>>>>>I wonder...
>>>>>>
>>>>>>Ed
>>>>>
>>>>>for me, I'd take the opteron.
>>>>>
>>>>>Crafty gets about 2M nps on a 1.8ghz opteron... single processor.
>>>>
>>>>Not exactly. Following are 2 log files from (new version of) Crafty running on
>>>>1.8GHz quad Opteron system. Run time vary from run to run, but those are typical
>>>>ones
>>>>
>>>>1 CPU: 1,762knps
>>>>4 CPUs: 6,856knps
>>>
>>>OK... I had done the calculation wrong. I thought that 6.8M for 4 was
>>>basically 3.2X faster than 1, due to the NUMA scaling issues. It looks
>>>from the above that it is now scaling almost 4:1 which is great. :)
>>>
>>>Now if my dual xeon would just scale 2.0 :)
>>
>>What is current number? I believe we improved it when you made some global
>>per-thread one, no?
>>
>>Thanks,
>>Eugene
>
>
>Looks better (I just tested.) Seems to be back to the magic
>1.9X (raw NPS is 1.9X faster with two processors than with
>1.
>
>Here's the raw data.
>
>one cpu:
>
> time=1:25 cpu=99% mat=0 n=85541805 fh=91% nps=998k
> time=55.41 cpu=99% mat=0 n=62193826 fh=95% nps=1122k
> time=1:40 cpu=99% mat=-1 n=89355667 fh=94% nps=886k
> time=1:18 cpu=99% mat=0 n=82339318 fh=92% nps=1050k
>
>two cpus (SMT off):
> time=49.12 cpu=198% mat=0 n=91626204 fh=91% nps=1865k
> time=27.55 cpu=198% mat=0 n=58868942 fh=95% nps=2136k
> time=1:00 cpu=198% mat=-1 n=101092946 fh=94% nps=1669k
> time=45.56 cpu=197% mat=0 n=89351627 fh=92% nps=1961k
>
>four cpus (SMT on):
> time=50.32 cpu=392% mat=0 n=105665041 fh=91% nps=2099k
> time=23.92 cpu=388% mat=0 n=57409674 fh=95% nps=2400k
> time=57.60 cpu=392% mat=-1 n=108568676 fh=93% nps=1884k
> time=40.88 cpu=396% mat=0 n=91017384 fh=92% nps=2226k

I didn't have time to analyze the data above, but I notice that since I have been doing the NUMA-specific fixes, which also have to do with cache coherency issues, the gain I get from SMT is no longer what it was a while back. I.e., from the raw NPS numbers, it seems to be about 10% faster now with SMT on than off. That is probably explained by the less frequent cache line loading for a specific shared variable that was causing problems earlier...

SMT on is still faster with a parallel search, for me, but the difference is not as stark as it was 6 months ago when this topic came up initially...
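For anyone who wants to check the ratios rather than eyeball them, here is a minimal sketch in Python that works them out from the NPS figures quoted above. The numbers are copied from the log lines and from Eugene's Opteron figures; the dictionary keys and helper names are made up for this illustration, and averaging the per-position ratios is just one reasonable way to summarize them.

```python
# NPS in thousands, copied from the quoted log lines (one entry per test position).
# The labels below are only names chosen for this sketch.
nps_k = {
    "one_cpu": [998, 1122, 886, 1050],
    "two_cpus_smt_off": [1865, 2136, 1669, 1961],
    "four_logical_smt_on": [2099, 2400, 1884, 2226],
}


def ratios(numer, denom):
    """Per-position speedup: numer[i] / denom[i]."""
    return [a / b for a, b in zip(numer, denom)]


def mean(values):
    return sum(values) / len(values)


# Dual Xeon: two physical CPUs versus one.
two_vs_one = ratios(nps_k["two_cpus_smt_off"], nps_k["one_cpu"])

# Same box: SMT on (four logical CPUs) versus SMT off (two CPUs).
smt_gain = ratios(nps_k["four_logical_smt_on"], nps_k["two_cpus_smt_off"])

# Quad Opteron figures from Eugene's post: 6,856 knps on 4 CPUs, 1,762 knps on 1.
opteron_scaling = 6856 / 1762

print(f"2 CPUs vs 1 (Xeon):    {mean(two_vs_one):.2f}x")
print(f"SMT on vs off (Xeon):  {mean(smt_gain):.2f}x")
print(f"4 CPUs vs 1 (Opteron): {opteron_scaling:.2f}x")
```

On these four positions the two-CPU ratio averages a little under 1.9, the SMT on/off ratio comes out slightly above the "about 10%" estimate, and the Opteron numbers give just under 3.9:1. These are raw NPS ratios only; they say nothing about time-to-depth.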
Last modified: Thu, 15 Apr 21 08:11:13 -0700