Computer Chess Club Archives


Subject: DIEP NUMA SMP at P4 3.06Ghz with Hyperthreading

Author: Vincent Diepeveen

Date: 11:08:29 12/13/02


Hello,

Here are some test results for DIEP, thanks to Chad Cowan, on an
Asus motherboard with HT turned on (amazingly it is no longer
called SMT; I forgot which manufacturer calls it HT and
which one calls it SMT. I guess it's Hyperthreading now for Intel).

HT turned on in all cases (speeds in nodes a second):

bus 533MHz, memory 133MHz (DDR SDRAM CAS 2)
single cpu P4 3.105GHz, 1 process DIEP (bus 135MHz by default, not 133) : 101394
single cpu P4 3.105GHz, 2 processes DIEP                                : 120095

So the speedup from HT is about 18%. Not bad. Not good either, knowing
DIEP hardly locks.
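
The 18% follows directly from the two node counts above. A minimal Python sketch of that arithmetic (the helper name `speedup_percent` is made up here for illustration, it is not part of DIEP):

```python
def speedup_percent(baseline_nps, new_nps):
    """Percentage gain of new_nps over baseline_nps."""
    return (new_nps / baseline_nps - 1.0) * 100.0

# Node counts quoted in the post (P4 3.105GHz, HT on).
p4_one_process = 101394
p4_two_processes = 120095

ht_gain = speedup_percent(p4_one_process, p4_two_processes)
print(f"HT gain: {ht_gain:.1f}%")
```

Rounded down, that is the "about 18%" quoted above.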

However, I have one problem with it when I compare that speed
with the same version on a 2.4GHz Northwood.

For DIEP that 2.4GHz Northwood is exactly as fast as a K7 at 1.6GHz.

Now the same version on that K7 logs:
    single cpu : 82499
    dual       : 154293

Note that the K7 has much slower RAM and a much slower chipset: 133MHz
registered CAS 2.5, I guess, versus fast CAS 2 for the P4 (something like
2T less latency, so 10T versus 12T or so). The P4 was a single cpu.

But here is the math, for those who are still reading and find it
interesting.

The single cpu speed difference is:
  P4 3.06GHz is faster by : 22.9%

Based upon the speed it is clocked at (3105MHz)
we would expect a speedup of 3.105 / 2.4 = 1.294, so 29.4%.

So somehow we lose around 7% in the process (29.4% - 22.9% = 6.5 points).
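
The comparison of observed versus expected scaling can be sketched in Python as below. It uses only the figures from this post and relies on the statement that the K7 1.6GHz runs DIEP exactly as fast as a 2.4GHz Northwood; the variable names are mine.

```python
# Single-process node counts quoted in the post.
p4_nps = 101394   # P4 clocked at 3.105 GHz
k7_nps = 82499    # K7 1.6 GHz, taken as equal to a P4 Northwood 2.4 GHz

observed = p4_nps / k7_nps    # measured speed ratio
expected = 3.105 / 2.4        # ratio if speed scaled linearly with clock

gap_points = (expected - observed) * 100.0   # shortfall in percentage points
efficiency = observed / expected             # fraction of ideal scaling reached

print(f"observed +{(observed - 1) * 100:.1f}%, expected +{(expected - 1) * 100:.1f}%")
print(f"shortfall: {gap_points:.1f} points, scaling efficiency: {efficiency:.3f}")
```

Expressed as a ratio rather than a difference, the P4 reaches roughly 95% of what linear clock scaling would give.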

Now it wins another 18% or so when it is run with 2 processes.
If I compare that with a single cpu K7, to get the relative
speed of a P4 GHz versus a K7 GHz, then we get the following comparison:

1.6GHz * (120k / 82k) = 2.33GHz

So a 2.33GHz K7 should be as fast as this P4 at 3.105GHz with HT.
Of course that assumes linear scaling with clock speed.

Now we calculate how many P4 GHz one K7 GHz compares to: 3.105 / 2.33 = 1.33
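
Both steps of that estimate, as a short Python sketch. It assumes, as the post does, that K7 speed scales linearly with clock; the variable names are mine.

```python
k7_clock_ghz = 1.6
k7_single_nps = 82499    # K7 1.6 GHz, one process
p4_clock_ghz = 3.105
p4_ht_nps = 120095       # P4 3.105 GHz, HT on, two processes

# Clock a single K7 would need to match the P4 with HT (linear scaling assumed).
k7_equivalent_ghz = k7_clock_ghz * p4_ht_nps / k7_single_nps

# How many P4 GHz are needed per K7 GHz for DIEP.
p4_ghz_per_k7_ghz = p4_clock_ghz / k7_equivalent_ghz

print(f"K7 equivalent clock: {k7_equivalent_ghz:.2f} GHz")
print(f"P4 GHz per K7 GHz:   {p4_ghz_per_k7_ghz:.2f}")
```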

So DDR RAM proves to be the big winner for the P4. SMT in itself
is just a trick that works for me because my parallelism is
pretty ok; most likely it will not work for everyone.

Now of course it's questionable whether that 18% speedup in nodes
a second also results in an actual speedup in ply depth.

For DIEP it does, but it's not impressive at all.

Now take a dual Xeon 2.8GHz, for which I will assume a ratio
of 1.4 instead of 1.33 (it will not have CAS 2 DDR RAM but of course
ECC registered RAM, which eats extra time).

That means that the equivalent K7 is a dual K7 2.0GHz (2.8 / 1.4 = 2.0),
and that is still not taking into account 3 things:

  a) my DIEP version was compiled with MSVC with the processor pack (SP4),
     so it was simply not optimized for the K7 at all; it was optimized
     more for the P4 than for the K7. It does not use MMX of course (that
     would slow it down on the P4 and let the K7 look relatively better).
  b) the speedup at 4 processors is a lot worse than at 2 processors,
     so when I run DIEP with 4 processes on the dual Xeon 2.8GHz
     the expectation is that the dual K7 2.0GHz will outgun it
     by quite some margin.
  c) that dual K7 2.0GHz is less than half the price of a dual Xeon 2.8GHz
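
The K7 equivalent of the Xeon can be sketched the same way in Python. The ratio of 1.4 is the assumption made above for the Xeon with ECC registered RAM, not a measurement; 1.33 is the figure derived from the desktop P4 run. The variable names are mine.

```python
xeon_clock_ghz = 2.8

assumed_ratio = 1.4     # assumed P4 GHz per K7 GHz for the Xeon (not measured)
measured_ratio = 1.33   # figure derived from the P4 3.105 GHz run with HT

# K7 clock per cpu that would match one Xeon 2.8 GHz cpu under each ratio.
k7_equiv_assumed = xeon_clock_ghz / assumed_ratio
k7_equiv_measured = xeon_clock_ghz / measured_ratio

print(f"ratio {assumed_ratio}:  K7 {k7_equiv_assumed:.2f} GHz")
print(f"ratio {measured_ratio}: K7 {k7_equiv_measured:.2f} GHz")
```

Even with the more favourable 1.33 ratio the equivalent K7 clock stays close to 2.1GHz, so the choice between the two ratios changes the comparison only slightly.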

Best regards,
Vincent