Author: Vincent Diepeveen
Date: 11:08:29 12/13/02
Hello,

Here are some test results of DIEP, thanks to Chad Cowan, on an Asus motherboard with HT turned on. (Amazingly it is no longer called SMT; I forgot which manufacturer calls it HT and which one SMT. I guess it's Hyperthreading now for Intel.)

HT turned on in all cases:
bus 533MHz
memory 133MHz (DDR SDRAM, CAS 2)

Results in nodes a second:
single cpu P4 3.105GHz (bus is 135MHz by default, not 133), 1 process DIEP : 101394
single cpu P4 3.105GHz, now 2 processes DIEP : 120095

So the speedup from HT is about 18%. Not bad. Not good either, knowing DIEP hardly locks.

However, there is one problem I have with it when I compare that speed with the same version on a 2.4GHz Northwood. That 2.4GHz Northwood is exactly the speed of a K7 at 1.6GHz. Now the same version on that K7 logs:

single cpu : 82499
dual : 154293

Note that the K7 has way, way slower RAM and chipset: 133MHz registered CAS 2.5, I guess, versus fast CAS 2 for the P4 (about 2T less latency, so 10 versus 12T or something). The P4 was a single cpu.

Here is the math, for those who are still reading and find it interesting.

The single cpu speed difference is: the P4 3.06GHz is 22.9% faster (101394 / 82499 = 1.229). Based upon the speed it is clocked at (3105MHz) we would expect a speedup of 3.105 / 2.4 = 1.294, so 29.4%. So somehow we lose around 7 percentage points in the process.

Now it wins another 18% or so when it gets run with 2 processes. If I compare that with a single cpu K7, to get the relative speed of a P4 GHz versus a K7 GHz, we get the following comparison:

1.6GHz * (120k / 82k) = 2.33GHz

So a 2.33GHz K7 should be equally fast as this P4 at 3.105GHz running 2 processes, of course assuming linear scaling with clock speed. Now we calculate what 1GHz of K7 compares to in P4 speed: 3.105 / 2.33 = 1.33GHz.

So DDR RAM proves to be the big winner for the P4. SMT in itself is just a trick that works for me because my parallelism is pretty OK, and most likely it does not work for everyone.

Now of course it's questionable whether that 18% speedup in nodes a second also results in an actual positive speedup in ply depth. For DIEP it does, but it's not so impressive at all.
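The arithmetic above can be re-checked with a short script. This is only a sketch: the node counts and clock speeds are the ones quoted in the post, the variable and function names are made up for illustration, and it relies on the same assumption as the post, namely that speed scales linearly with clock.

```python
# Nodes-a-second figures and clock speeds as quoted in the post.
P4_SINGLE_NPS = 101394      # P4 at 3.105GHz, 1 DIEP process
P4_HT_NPS = 120095          # same P4, 2 DIEP processes with HT on
K7_SINGLE_NPS = 82499       # K7 at 1.6GHz, 1 DIEP process

P4_CLOCK_GHZ = 3.105
NORTHWOOD_CLOCK_GHZ = 2.4   # stated in the post to equal a K7 at 1.6GHz
K7_CLOCK_GHZ = 1.6


def percent_gain(new, old):
    """Return how much faster `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


# Speedup from running 2 processes on one HT-enabled P4.
ht_gain = percent_gain(P4_HT_NPS, P4_SINGLE_NPS)

# Measured single-process gain of the P4 over the K7 1.6GHz
# (which stands in for a 2.4GHz Northwood).
measured_gain = percent_gain(P4_SINGLE_NPS, K7_SINGLE_NPS)

# Gain expected from clock speed alone, assuming linear scaling.
expected_gain = percent_gain(P4_CLOCK_GHZ, NORTHWOOD_CLOCK_GHZ)

# K7 clock that would match the P4 with HT, assuming linear scaling.
k7_equivalent_ghz = K7_CLOCK_GHZ * P4_HT_NPS / K7_SINGLE_NPS

# How many P4 GHz correspond to 1 K7 GHz.
p4_ghz_per_k7_ghz = P4_CLOCK_GHZ / k7_equivalent_ghz

print(f"HT speedup:            {ht_gain:.1f}%")
print(f"measured P4 gain:      {measured_gain:.1f}%")
print(f"expected from clock:   {expected_gain:.1f}%")
print(f"equivalent K7 clock:   {k7_equivalent_ghz:.2f}GHz")
print(f"P4 GHz per K7 GHz:     {p4_ghz_per_k7_ghz:.2f}")
```

Changing the constants at the top is enough to redo the comparison for other measurements, as long as the linear scaling assumption is acceptable.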
It is not so impressive because of the following. Take a dual Xeon 2.8GHz, for which I will also assume a P4-to-K7 ratio of 1.4 (assuming not CAS 2 DDR RAM but of course ECC registered, which eats extra time). That means the equivalent K7 will be a dual K7 2.0GHz (2.8 / 1.4 = 2.0), and that is still not taking into account 3 things:

a) My DIEP version was compiled with MSVC with the processor pack (sp4), so it was simply not optimized for the K7 at all; it was optimized more for the P4 than for the K7. It does not use MMX, of course (that would slow it down on the P4 and make the K7 look relatively better).

b) Speedup at 4 processors is a lot worse than at 2 processors, so when I run DIEP with 4 processes on the dual Xeon 2.8, the expectation is that the dual K7 2.0GHz will outgun it by quite some margin.

c) That dual K7 2.0GHz is less than half the price of a dual P4 2.8GHz.

Best regards,
Vincent
Last modified: Thu, 15 Apr 21 08:11:13 -0700