Author: Eugene Nalimov
Date: 08:14:25 04/15/04
On April 15, 2004 at 09:01:44, Robert Hyatt wrote:

>On April 15, 2004 at 06:05:15, Joachim Rang wrote:
>
>>On April 14, 2004 at 22:49:39, Robert Hyatt wrote:
>>
>>>I just finished some HT on / HT off tests to see how things have changed in
>>>Crafty since some of the recent NUMA-related memory changes that were made.
>>>
>>>Point 1. HT now speeds Crafty up between 5 and 10% max. A year ago this was
>>>30%. What did I learn? Nothing new. Memory waits benefit HT. Eugene and I
>>>worked on removing several shared memory interactions which led to better cache
>>>utilization, fewer cache invalidates (very slow) and improved performance a good
>>>bit. But at the same time, now HT doesn't have the excessive memory waits it
>>>had before and so the speedup is not as good.
>>>
>>>Point 2. HT now actually slows things down due to SMP overhead. IE I lose 30%
>>>per CPU, roughly, due to SMP overhead. HT now only gives 5-10% back. This is a
>>>net loss. I am now running my dual with HT disabled...
>>>
>>>More as I get more data... Here are two data points however:
>>>
>>>pos1. cpus=2 (no HT) NPS = 2.07M time=18.13
>>>      cpus=4         NPS = 2.08M time=28.76
>>>
>>>pos2. cpus=2         NPS = 1.87M time=58.48
>>>      cpus=4         NPS = 2.01M time=66.00
>>>
>>>First pos HT helps almost none in NPS, costs 10 seconds in search overhead.
>>>Ugly. Position 2 gives about 5% more nps, but again the SMP overhead washes
>>>that out and there is a net loss. I should run the speedup tests several times,
>>>but the NPS numbers don't change much, and the speedup could change. But this
>>>offers enough..
>>
>>
>>In a German board someone posted figures for the Fritzmark of Fritz 8. Fritz
>>still gains 25% from HT (in this specific position)
>>
>>cpus=2 NPS = 2.35
>>cpus=4 NPS = 2.95
>>
>>I have unfortunately no information about search time.
>>
>>Does that mean Fritz 8 is poorly optimized?
>>
>>regards Joachim
>
>
>It means it has some cache issues that can be fixed to speed it up further, yes.

...or that Fritz has less instruction-level parallelism, so there are a lot of
idle execution units. Crafty is special due to bitboards...

Thanks,
Eugene
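
For reference, the relative changes implied by the figures quoted above can be worked out with a few lines of Python. This is only a sketch: the helper name `pct_change` and the `runs` layout are illustrative assumptions, not anything taken from Crafty or Fritz, and the numbers are copied as quoted (the Fritz figures carry no unit in the original post, so only their ratio is used).

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures as quoted in the thread: (cpus=2, cpus=4).
# "nps" is nodes per second, "time" is search time in seconds.
# The Fritz post gave no search time, so it is recorded as None.
runs = {
    "Crafty pos1": {"nps": (2.07, 2.08), "time": (18.13, 28.76)},
    "Crafty pos2": {"nps": (1.87, 2.01), "time": (58.48, 66.00)},
    "Fritz 8 Fritzmark": {"nps": (2.35, 2.95), "time": None},
}


def summarize(runs):
    """Return one line per run with the NPS and time changes."""
    lines = []
    for name, r in runs.items():
        nps_gain = pct_change(*r["nps"])
        if r["time"] is None:
            time_part = "time: not reported"
        else:
            time_part = f"time: {pct_change(*r['time']):+.1f}%"
        lines.append(f"{name}: NPS {nps_gain:+.1f}%, {time_part}")
    return lines


if __name__ == "__main__":
    for line in summarize(runs):
        print(line)
```

A positive NPS change together with a positive time change is the "net loss" case described in the quoted post: more nodes per second, but a longer time to finish the search.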
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.