Computer Chess Club Archives



Subject: Re: Multiple processors on one chip...

Author: Robert Hyatt

Date: 07:19:59 03/05/00



On March 05, 2000 at 03:10:57, Tom Kerrigan wrote:

>On March 04, 2000 at 23:17:45, Robert Hyatt wrote:
>
>>>In that case, I don't think it's possible to use Crafty to compare the processor
>>>cores. The TSCP benchmarks give much more accurate data in that regard.
>>>
>>>-Tom
>>
>>
>>Only for small programs.  What about programs with larger cache footprints?
>
>I don't see why you want to bring the cache into this, if you just want to
>compare the cores. (Which I do.)


Two reasons.  (1) If a program fits entirely in cache, you are testing only one
aspect of the CPU.  If the program doesn't fit in cache, you add memory
bandwidth and cache-miss handling to the equation.  (2) "Core" speed alone is
only important for a program that fits completely in cache, and I am not aware
of many such programs that do useful work.  Such a benchmark is not very useful
for the 99+ percent of programs that don't fit entirely within cache.
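
To put rough numbers on that, here is a minimal sketch using the textbook
stall model (effective CPI = core CPI + memory references per instruction *
miss rate * miss penalty).  Every figure in it is a made-up assumption for
illustration, not a measurement of the P5, the P6, Crafty, or TSCP.

```python
def effective_cpi(core_cpi, refs_per_instr, miss_rate, miss_penalty):
    """Cycles per instruction once cache-miss stalls are added to the core CPI."""
    return core_cpi + refs_per_instr * miss_rate * miss_penalty

# Hypothetical cores: the "fast" one needs half the cycles per instruction.
FAST_CORE_CPI = 0.5
SLOW_CORE_CPI = 1.0

# Hypothetical workloads: one fits in cache (no misses), one does not.
in_cache = dict(refs_per_instr=0.3, miss_rate=0.0, miss_penalty=50)
out_of_cache = dict(refs_per_instr=0.3, miss_rate=0.10, miss_penalty=50)

def speedup(workload):
    """How much faster the fast core runs the workload than the slow core."""
    slow = effective_cpi(SLOW_CORE_CPI, **workload)
    fast = effective_cpi(FAST_CORE_CPI, **workload)
    return slow / fast

print("fits in cache:   %.2fx" % speedup(in_cache))
print("misses in cache: %.2fx" % speedup(out_of_cache))
```

With these assumed numbers the core advantage is 2x when everything fits in
cache and shrinks to about 1.25x once misses dominate, which is why a
cache-resident benchmark says little about larger programs.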


>
>>Which means no register jams occur in the program.  For more complex programs,
>>the renaming logic in the P6 avoids many register jams/spills and does much
>>better keeping both pipes filled.
>
>Do you have any proof of this?

This is a trivial thing to consider.  If you have 1 register, try to figure
out a way to keep 2 instruction pipes busy, since both can't update that one
register in one cycle, nor can one update while the other reads.  If you have
6 registers, you can feed in 2 instructions that use 2 registers and modify
a third.  But then how do you fill up other pipeline stages?
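
The one-register argument can be sketched with a toy model.  This is an
assumption-laden simplification, not the real P5 pairing rules or the P6
renaming logic: a machine issues at most two adjacent instructions per cycle,
each instruction is just (destination, sources), and memory operands like
"m0" are treated as plain names.  The function name and the sample programs
are hypothetical.

```python
def cycles_dual_issue(program, renaming=False):
    """Count cycles for an in-order machine that pairs adjacent instructions.

    Two adjacent instructions share a cycle unless the second reads the
    first's result (a true dependency), or, without renaming, they fight
    over the same register name (write/write or read/write conflict).
    """
    cycles = 0
    i = 0
    while i < len(program):
        cycles += 1
        issued = 1
        if i + 1 < len(program):
            dest1, srcs1 = program[i]
            dest2, srcs2 = program[i + 1]
            true_dep = dest1 in srcs2      # second reads what first writes
            name_clash = dest1 == dest2 or dest2 in srcs1
            if not true_dep and (renaming or not name_clash):
                issued = 2
        i += issued
    return cycles

# Two independent copies (m0 -> m1, m2 -> m3) squeezed through one register.
one_register = [("a", ("m0",)), ("m1", ("a",)), ("a", ("m2",)), ("m3", ("a",))]
# The same work scheduled across two registers.
two_registers = [("a", ("m0",)), ("b", ("m2",)), ("m1", ("a",)), ("m3", ("b",))]

print(cycles_dual_issue(one_register))                 # no renaming
print(cycles_dual_issue(one_register, renaming=True))  # name clashes removed
print(cycles_dual_issue(two_registers))                # enough registers
```

In this toy model the one-register version never pairs, renaming recovers
part of the loss, and the two-register version keeps both pipes busy every
cycle.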

My architecture books give this as the classic trade-off that explains why
RISC processors have a bunch of registers: to avoid (a) jams (running out of
registers and stalling until one becomes free) and (b) spills (having to dump
registers to memory to free them up for necessary calculations).




>
>Here's a simplification of the issue, but it helps to illustrate the problem:
>
>The P5 has a 5 stage pipeline. For it to be full, it needs to be executing
>2*5=10 instructions at once. The P6 has a 12 stage pipeline. It needs 12*2=24
>instructions to be full.
>
>So if somebody told me that the Pentium's pipes are usually more full than the
>P6's pipes, I would have absolutely no problem believing it. Even if the P6 does
>have all sorts of fancy features.
>
>-Tom


True...  but the P5 pipeline isn't "more full" unless you run a very simple
program which has simple calculations requiring just a few registers.  The
renaming is important for handling problems on an architecture that has such
a paltry number of registers as the base Intel architecture.  But note Intel
isn't the only vendor using renaming...




Last modified: Thu, 15 Apr 21 08:11:13 -0700
