Author: Tom Kerrigan
Date: 01:55:59 02/12/03
On February 12, 2003 at 00:37:13, Robert Hyatt wrote:

>No No No. They do much of this with 100% accuracy. Because they make sure
>that the critical instructions are executed in _every_ path that reaches a
>critical point in the data-flow analysis of the program (the dependency graph
>for gcc users)...

You're not making any sense. You have a branch. You have two possible control paths. The instructions in each path are different. Which ones do you advance?

>BTW OOOE has a huge limit. Something like 40-60 (I don't have my P6/etc
>manuals here at home) micro-ops in the reorder buffer. No way to do any
>OOOE beyond that very narrow peephole, while the compiler can see _much_
>more, as much as it wants (and has the compile time) to look at...

Alright. So run compiled code on your OOO processor.

>registers when the real instructions get turned into micro-ops... but at
>least the latter is more a result of a horrible architecture (8 registers)
>as opposed to the fact the OOO execution is a huge boon for other architectures
>that are not so register-challenged...

Funny, my 30% number was for the Alpha and MIPS chips. I wouldn't consider them register challenged.

>Sure. But given the choice of OOOE with 8 int alus, or no OOOE with 16
>int alus and an instruction package large enough to feed them all, I would
>consider the latter seriously...

We have chips today with 9 execution units that retire, on average, one instruction per cycle, and you think you can fill 16 int slots?

>The Cray T932 was the last 64 bit machine they built that I used. And it
>can produce a FLOP count that no PC on the planet can come within a factor of
>10 of and that is being very generous. 2ns clock, 32 cpus, each cpu can read
>four words and write two words to memory per clock cycle, and with vector
>chaining, it can do at _least_ eight floating point operations per cycle per
>CPU.

How many NPS does Crafty get on it?

>I did a branchless FirstOne() in asm a few weeks back here, just to test.
>It used a cmov, and it wasn't slower than the one with a branch. If the

On a Pentium III?

-Tom
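
For readers who want to see what the branch vs. no-branch FirstOne() comparison amounts to, here is a rough C sketch. It is not the asm/cmov routine mentioned in the quote and not Crafty's code: the function names and the convention used here (return the index of the most significant set bit, 0..63, or 64 for an empty word) are assumptions made purely for illustration. Whether a given compiler turns the second version into cmov, setcc, or something else depends on the compiler and the target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical convention (an assumption, not necessarily Crafty's):
   return the index of the most significant set bit, 0..63,
   or 64 if the word is empty. */

/* Branching version: narrows the search with a chain of if statements. */
static int first_one_branchy(uint64_t w) {
    int n = 0;
    if (w == 0) return 64;
    if (w >> 32) { n += 32; w >>= 32; }
    if (w >> 16) { n += 16; w >>= 16; }
    if (w >> 8)  { n += 8;  w >>= 8;  }
    if (w >> 4)  { n += 4;  w >>= 4;  }
    if (w >> 2)  { n += 2;  w >>= 2;  }
    if (w >> 1)  { n += 1; }
    return n;
}

/* Branch-free version: each step turns a comparison result (0 or 1)
   into a shift amount, so the source contains no if statements. */
static int first_one_branchless(uint64_t w) {
    int empty = (w == 0);   /* 1 if no bits are set, else 0 */
    int n = 0, s;
    s = ((w >> 32) != 0) << 5; n += s; w >>= s;
    s = ((w >> 16) != 0) << 4; n += s; w >>= s;
    s = ((w >> 8)  != 0) << 3; n += s; w >>= s;
    s = ((w >> 4)  != 0) << 2; n += s; w >>= s;
    s = ((w >> 2)  != 0) << 1; n += s; w >>= s;
    n += (int)(w >> 1);     /* w is now 0..3, so this adds 0 or 1 */
    return n + (empty << 6);/* n is 0 for an empty word, so this gives 64 */
}

/* Checks that both versions agree on every single-bit word and on zero. */
static void first_one_selfcheck(void) {
    int i;
    assert(first_one_branchy(0) == first_one_branchless(0));
    for (i = 0; i < 64; i++) {
        uint64_t w = (uint64_t)1 << i;
        assert(first_one_branchy(w) == i);
        assert(first_one_branchless(w) == i);
    }
}
```

Timing the two on a particular chip (a Pentium III, say) is a separate exercise; the sketch only shows that the same result can be computed with and without conditional jumps in the source.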
Last modified: Thu, 15 Apr 21 08:11:13 -0700