Author: Robert Hyatt
Date: 07:19:59 03/05/00
On March 05, 2000 at 03:10:57, Tom Kerrigan wrote:

>On March 04, 2000 at 23:17:45, Robert Hyatt wrote:
>
>>>In that case, I don't think it's possible to use Crafty to compare the processor
>>>cores. The TSCP benchmarks give much more accurate data in that regard.
>>>
>>>-Tom
>>
>>
>>Only for small programs. What about programs with larger cache footprints?
>
>I don't see why you want to bring the cache into this, if you just want to
>compare the cores. (Which I do.)

Two reasons. (1) If a program fits totally in cache, you are testing one aspect of the CPU. If the program doesn't fit into cache, you add memory bandwidth and cache miss handling into the equation. (2) "Core" speed is only important for a program that fits completely in cache. I am not aware of many such programs that do useful work. Such a benchmark is not so useful for the 99+ percent of programs that don't fit entirely within cache.

>
>>Which means no register jams occur in the program. For more complex programs,
>>the renaming logic in the P6 avoids many register jams/spills and does much
>>better keeping both pipes filled.
>
>Do you have any proof of this?

This is a trivial thing to consider. If you have 1 register, try to figure out a way to keep 2 instruction pipes busy: both can't update that one register in one cycle, nor can one update it while the other reads it. If you have 6 registers, you can feed in 2 instructions that each use 2 registers and modify a third. But then how do you fill up the other pipeline stages? My architecture books give this as the classic trade-off that explains why RISC processors have a bunch of registers: to avoid (a) jams (running out of registers and stalling until one becomes free) and (b) spills (having to dump registers to memory to free them up for necessary calculations).

>
>Here's a simplification of the issue, but it helps to illustrate the problem:
>
>The P5 has a 5 stage pipeline. For it to be full, it needs to be executing
>2*5=10 instructions at once. The P6 has a 12 stage pipeline. It needs 12*2=24
>instructions to be full.
>
>So if somebody told me that the Pentium's pipes are usually more full than the
>P6's pipes, I would have absolutely no problem believing it. Even if the P6 does
>have all sorts of fancy features.
>
>-Tom

True... but the P5 pipeline isn't "more full" unless you run a very simple program that has simple calculations requiring just a few registers. Renaming is important for handling problems on an architecture with as paltry a number of registers as the base Intel architecture. But note that Intel isn't the only vendor using renaming...
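
The register argument above can be sketched as a toy model. This is only an illustration under stated assumptions: a 2-wide machine that pairs adjacent instructions in program order, where the pairing rule without renaming is exactly the one described in the post (two instructions can't both write one register in the same cycle, and one can't write a register the other reads). It does not reproduce the real P5 pairing rules or the P6 out-of-order core. The names Instr, can_pair, cycles, in_flight and PROGRAM are all hypothetical, and the pipeline figures are the ones from Tom's simplification.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Toy instruction: one destination register, zero or more sources."""
    dest: str
    srcs: Tuple[str, ...]


def in_flight(pipes: int, stages: int) -> int:
    """Instructions needed to fill every stage of every pipe
    (the quoted simplification: pipes * stages)."""
    return pipes * stages


def can_pair(a: Instr, b: Instr, renaming: bool) -> bool:
    """Can b issue in the same cycle as the earlier instruction a?"""
    raw = a.dest in b.srcs   # b reads what a writes: a true dependency
    waw = a.dest == b.dest   # both write the same register name
    war = b.dest in a.srcs   # b overwrites a register a still has to read
    if renaming:
        # Assumption: renaming hands b a fresh physical register, so only
        # the true (read-after-write) dependency still blocks pairing.
        return not raw
    return not (raw or waw or war)


def cycles(program: List[Instr], renaming: bool) -> int:
    """Issue cycles for a 2-wide, in-order machine that pairs greedily."""
    count = 0
    i = 0
    while i < len(program):
        if i + 1 < len(program) and can_pair(program[i], program[i + 1], renaming):
            i += 2   # both pipes used this cycle
        else:
            i += 1   # second pipe sits idle
        count += 1
    return count


# A chain of register-to-register moves: each instruction overwrites the
# register the previous one reads, and no instruction reads a fresh result.
PROGRAM = [
    Instr("eax", ("ebx",)),
    Instr("ebx", ("ecx",)),
    Instr("ecx", ("edx",)),
    Instr("edx", ("esi",)),
    Instr("esi", ("edi",)),
    Instr("edi", ("ebp",)),
]

if __name__ == "__main__":
    print("P5 in flight:", in_flight(pipes=2, stages=5))
    print("P6 in flight:", in_flight(pipes=2, stages=12))
    print("cycles without renaming:", cycles(PROGRAM, renaming=False))
    print("cycles with renaming:   ", cycles(PROGRAM, renaming=True))
```

In this model every adjacent pair in PROGRAM collides on a register name without sharing any data, so the second pipe stays idle on every cycle until renaming is switched on. Real code mixes true and false dependencies, so the gap would be smaller than this contrived case suggests.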
Last modified: Thu, 15 Apr 21 08:11:13 -0700