Author: fca
Date: 04:59:04 08/13/98
On August 12, 1998 at 13:55:01, Tom Kerrigan wrote:

>On August 12, 1998 at 10:44:40, fca wrote:
>
>>You wrote:
>>"Aside from software differences, the Pentium MMX/200 has a 66MHz L2 cache
>>(possibly smaller than 512k) whereas the Pentium II/300 has a 512k 150MHz L2
>>cache. If a program really bangs on the L2 cache, it will go much faster on the
>>Pentium II."
>>But since the core (P2/300 vs P200MMX) is so much faster, the extra/faster cache
>>(even if accessed a lot) simply serves to alleviate what the faster core would
>>*otherwise* have made into a bottleneck. L2-hit rates etc suggest in itself
>>this would not be able to increase the speed ratio above that the cores deliver.
>
>Sort of. Notice that the 200/66 core clock speed/L2 cache speed ratio is much
>less than the 300/150 ratio, so if L2 cache is a bottleneck on the Pentium
>MMX/200, then it's less of a bottleneck on the Pentium II/300.

But...

1. Comparing cores is not just a matter of comparing MHz, as of course you know. The P6 is more efficient (ignoring caching considerations) in many subtle ways.

2. Of course it is not efficient enough to make up for a 3:1 vs 2:1 difference (core MHz vs L2 speed ratios)... but a poster whose technical viewpoint is one with which I usually agree stated that:

'The P6 core is designed to be less dependent on the L2 cache, too.' ;-)

I believe 1. & 2. together make my "case" as pasted above by you...

>>So, I am still surprised at the 2.5x reported. Aren't you?

Actually, thanks to clarification by blass as to what he was running (the effect of the 16-bit F5 "harness" on J) and some benchmarks kindly posted by Amir, we are not headed for 2.5x any more. I view this discussion as more of an "are we correctly understanding the effect of core changes and L2 changes on chess nps" one.

>Not really. Consider this:
>Pentium MMX/200 = 1
>Pentium MMX/300 = 1.5 (assume linear scaling)
>Pentium MMX/300 * 1.66 = 2.5 (66% improvement from P5 -> P6 core)

Ah - you are dropping the "L2-hit-hard" theory now?
Perhaps not. :-)

In respect of the above scalings:

As I do not believe there was an MMX300, only a P2/300 (ignoring Celeron), I interpret the above as

2.5 = (300/200 MHz linear scaling) x 1.66 (a claimed P5 -> P6 core change effect).

I realise you wrote it this way to progress from 200MMX to P2/300 in steps.

I disagree with the validity of claiming linear scaling (i.e. MHz dependence) with the same core-type and L2 size/speed. It is *simply not* backed up by evidence (hosts of evidence from benchmarks at www.intel.com). It assumes no bottlenecks.

I can't quote you data for the MMX200:MMX300 (no such CPU) so I choose the closest P5 equivalent, MMX166:MMX233. For all these, the m/b bus operated at the same 66MHz, just like with the original.

            SPECint(base)95 - Unix    SPECint95 - NT 4.0
MMX166              5.60                    5.54
MMX233              7.12                    7.02
P2/233              9.47                    9.44
P2/333             12.8                    12.7

Other non-f.p. benchmarks make my "case" even better.

We first consider:

MMX233:MMX166       1.27x                   1.27x

But the MHz ratio = 233/166 = 1.40x ;-)

And of course the benchmarks do not stress L2 etc as a chess program might. If they did, hmmmm.

The L2 cache speed, 66MHz, is the same for both MMX166 and MMX233, but while the size is 512K for the 233 it _might_ be just 256K for the MMX166. Now if 512K, the stress would obviously be higher on the faster core - so 1.27 --> say 1.22 (just to put a number on it to avoid confusion). If 256K (which seems unlikely, else like was not being compared with like), it is possible that the 1.27x applies, or maybe very marginally higher (size being less important than speed because of the chess-use).

Comparing with the MHz ratio for the P6 core (here, "proportionate" L2 speed, but no size change):

P2/333:P2/233       1.35x                   1.35x

But the MHz ratio = 333/233 = 1.43x ;-)

So in summary I suggest the 1.5x you quote is more like 1.35x in chess practice.

The 1.66x is also interesting. The above table is useful here too.

P2/233:MMX233       1.33x                   1.34x

Here both had 512K L2, but the P2/233's worked at 116.5 MHz and the other at 66MHz.
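The ratios quoted above can be rechecked straight from the SPECint table. The following is only a minimal sketch: the dictionary layout and the function name `speedup` are made up for illustration, and the scores are simply the ones quoted in this post, not independently verified figures.

```python
# SPECint95 scores as quoted in the post: (Unix base score, NT score).
SPECINT = {
    "MMX166": (5.60, 5.54),
    "MMX233": (7.12, 7.02),
    "P2/233": (9.47, 9.44),
    "P2/333": (12.8, 12.7),
}

# Core clock in MHz, using the nominal figures from the post.
MHZ = {"MMX166": 166, "MMX233": 233, "P2/233": 233, "P2/333": 333}


def speedup(faster, slower):
    """Return (Unix ratio, NT ratio, MHz ratio), each rounded to 2 places."""
    unix = SPECINT[faster][0] / SPECINT[slower][0]
    nt = SPECINT[faster][1] / SPECINT[slower][1]
    clock = MHZ[faster] / MHZ[slower]
    return round(unix, 2), round(nt, 2), round(clock, 2)


if __name__ == "__main__":
    pairs = [("MMX233", "MMX166"), ("P2/333", "P2/233"), ("P2/233", "MMX233")]
    for faster, slower in pairs:
        unix, nt, clock = speedup(faster, slower)
        print(f"{faster}:{slower}  Unix {unix}x  NT {nt}x  MHz {clock}x")
```

In the first two pairs the benchmark ratio comes out below the MHz ratio, which is the point being made against linear scaling. The third pair has a MHz ratio of 1.0, so whatever gain it shows comes from the core and L2 change together.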
Of course we remember that where all other things are equal (?!), L2 dependency is *reduced* with the P6 (=P2).

So we have two factors suggesting that the 1.33x overstates it (same 512K size, reduced dependency) and one (L2 speed) that suggests the opposite. My guess is that the c. 1.34x holds. This is supported by Ed's Rebel benchmarks (different program, I know, but closer to Junior than SPECint is!).

In any event, it appears inconceivable that the 1.34x could become 1.66x in a chess environment - chess programs simply don't do that sort of thing.

>So it's not out of the question.

And I say 2.5x is, based on the above. About 1.35^2, say 1.9x tops, and that's pushing it.

Bottlenecks are nasty things: bypass one, you just fall into the next.

So - over to you!

>-Tom

Kind regards

fca
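As a footnote to the estimate above, the two competing compound figures can be put side by side. This is a rough sketch only: the helper name `combined` is hypothetical, and it assumes the clock-step and core-step factors simply multiply, which is the same simplification both posters make.

```python
def combined(clock_step, core_step):
    """Multiply a clock-scaling factor by a core-change factor."""
    return clock_step * core_step


# Tom's version: linear MHz scaling (1.5x) times a claimed 1.66x P5 -> P6 gain.
tom_estimate = combined(1.5, 1.66)

# fca's version: the benchmark-derived factors of about 1.35x and 1.34x.
fca_estimate = combined(1.35, 1.34)

# The "about 1.35^2" shorthand used in the post.
shorthand = 1.35 ** 2

if __name__ == "__main__":
    print(round(tom_estimate, 2), round(fca_estimate, 2), round(shorthand, 2))
```

Both of the benchmark-derived products stay under the "1.9x tops" ceiling suggested in the post, while the linear-scaling product lands at about 2.5x.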
Last modified: Thu, 15 Apr 21 08:11:13 -0700