Author: Matt Taylor
Date: 00:13:27 02/12/03
On February 12, 2003 at 00:23:53, Robert Hyatt wrote:

>On February 11, 2003 at 23:27:04, Tom Kerrigan wrote:
>
>>On February 11, 2003 at 23:11:09, Charles Roberson wrote:
>>
>>>
>>> Out-of-order execution is nothing more than the ability to execute
>>>instructions in an order different from the serial order in the code.
>>>It has nothing to do with branching, but it enables other branching techniques.
>>>OOOE is simply:
>>> 1) the code has instructions a,b,c,d, in that order
>>> 2) if there are no serial dependencies then they can be executed in the
>>> b,d,c,a order.
>>>
>>> That is all OOOE is.
>>
>>I don't see how this is different from what I said. Branches are instructions
>>too.
>>
>>-Tom
>
>
>What he is saying is that whatever the hardware can do with OOO execution,
>the compiler can replicate it by massaging the instruction stream with well-
>known optimization tricks. With the sole exception of register renaming.
>
>The reason OOO execution works so well on Intel is _solely_ based on the
>fact that the architecture has almost no registers. And renaming lets the
>hardware expand that number of registers _significantly_ so that the
>architecture can do things that other less-register-challenged architectures
>can do without OOO execution as a crutch...
>
>IE I can show you code for the Cray that executes an instruction every cycle
>that an instruction can execute, yet it is a serial-order execution processor
>from the ground-up, but with help from a _really_ good instruction scheduler
>pass after the final object code has been generated... This scheduler can
>replicate/hoist instructions as needed to back them up to the point that their
>result is ready the cycle it is needed...

Some of my bitscan code for the Athlon executed a useful instruction in every slot -- 3 IPC in 15-20 cycles of code. The sole enabling factor was the fact that I moved instructions everywhere. It was a nightmare to debug when I accidentally moved instructions in front of their dependencies.

One of the biggest gains I had was moving register loads a fair number of cycles backward when I had free slots. This is difficult on IA-32 for obvious reasons, but it works very well when you have a larger number of registers.

-Matt
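To make the load-hoisting point concrete, here is a rough sketch in Python of a toy single-issue, in-order machine. Everything in it is an assumption made for illustration: the `Instr` tuple, the register names, the 3-cycle load latency, and the two helper functions are hypothetical and are not taken from the bitscan code or from any real Athlon or Cray timing. It only shows that issuing loads earlier hides their latency, and that a reordering which puts an instruction ahead of its producer can be caught mechanically.

```python
from collections import namedtuple

# Hypothetical toy instruction: name, destination register,
# source registers, and result latency in cycles (all made up).
Instr = namedtuple("Instr", "name dest srcs latency")


def in_order_cycles(program):
    """Cycle at which the last result is ready on a single-issue,
    in-order machine that stalls until every source is available."""
    ready = {}   # register -> cycle its value becomes available
    cycle = 0    # next free issue slot
    finish = 0
    for ins in program:
        # Stall until all source operands have been produced.
        issue = max([cycle] + [ready.get(r, 0) for r in ins.srcs])
        ready[ins.dest] = issue + ins.latency
        finish = max(finish, ready[ins.dest])
        cycle = issue + 1
    return finish


def respects_dependencies(program):
    """True if every source register is written by an earlier instruction.
    Only read-after-write is checked; register reuse is not modeled."""
    written = set()
    for ins in program:
        if any(r not in written for r in ins.srcs):
            return False
        written.add(ins.dest)
    return True


ld_a = Instr("load a", "r1", (), 3)
add_a = Instr("add a", "r2", ("r1",), 1)
ld_b = Instr("load b", "r3", (), 3)
add_b = Instr("add b", "r4", ("r2", "r3"), 1)

naive = [ld_a, add_a, ld_b, add_b]     # each load sits right before its use
hoisted = [ld_a, ld_b, add_a, add_b]   # second load moved up into a free slot
broken = [add_a, ld_a, ld_b, add_b]    # consumer moved ahead of its producer

print(in_order_cycles(naive), in_order_cycles(hoisted))
print(respects_dependencies(hoisted), respects_dependencies(broken))
```

In this model the hoisted order finishes sooner because the second load overlaps the first load's latency instead of starting after the first add. Note that hoisting keeps `r3` live for longer, which is the cost that becomes hard to pay when the architecture has only a handful of registers.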
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.