Computer Chess Club Archives


Subject: Re: IA-64 vs OOOE (attn Taylor, Hyatt)

Author: Matt Taylor

Date: 00:13:27 02/12/03


On February 12, 2003 at 00:23:53, Robert Hyatt wrote:

>On February 11, 2003 at 23:27:04, Tom Kerrigan wrote:
>
>>On February 11, 2003 at 23:11:09, Charles Roberson wrote:
>>
>>>
>>>  Out-of-order execution is nothing more than the ability to execute
>>>instructions in an order different from the serial order in the code.
>>>It has nothing to do with branching, but it enables other branching techniques.
>>>OOOE is simply:
>>>   1) the code has instructions a,b,c,d, in that order
>>>   2) if there are no serial dependencies then they can be executed in the
>>>       b,d,c,a order.
>>>
>>>    That is all OOOE is.
>>
>>I don't see how this is different from what I said. Branches are instructions
>>too.
>>
>>-Tom
>
>
>What he is saying is that whatever the hardware can do with OOO execution,
>the compiler can replicate by massaging the instruction stream with
>well-known optimization tricks, with the sole exception of register renaming.
>
>The reason OOO execution works so well on Intel is _solely_ based on the
>fact that the architecture has almost no registers.  And renaming lets the
>hardware expand that number of registers _significantly_ so that the
>architecture can do things that other less-register-challenged architectures
>can do without OOO execution as a crutch...
>
>I.e., I can show you code for the Cray that executes an instruction in every
>cycle in which an instruction can execute, yet it is a serial-order execution
>processor from the ground up. It gets help from a _really_ good instruction
>scheduler pass that runs after the final object code has been generated...
>This scheduler can replicate/hoist instructions as needed, backing them up
>far enough that their result is ready in the cycle it is needed...

Some of my bitscan code for the Athlon executed a useful instruction in every
slot -- 3 IPC over 15-20 cycles of code. The only thing that made this possible
was that I reordered instructions by hand everywhere. It was a nightmare to
debug when I accidentally moved instructions in front of their dependencies.
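
As an illustration of the "moved in front of their dependencies" problem, here
is a minimal Python sketch of a checker that compares a hand-reordered sequence
against the original program order. The `Instr` record, the register names and
the a/b/c/d program are hypothetical, and the model tracks only register
dependencies (no memory or flags), so it is a toy and not a description of any
real tool.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    name: str
    dst: str       # register written
    srcs: tuple    # registers read

def depends(later: Instr, earlier: Instr) -> bool:
    """True if `later` must stay after `earlier` in program order."""
    raw = earlier.dst in later.srcs   # read-after-write (true dependency)
    war = later.dst in earlier.srcs   # write-after-read
    waw = later.dst == earlier.dst    # write-after-write
    return raw or war or waw

def violations(original, reordered):
    """List (earlier, later) name pairs whose required order was swapped."""
    pos = {ins.name: i for i, ins in enumerate(reordered)}
    bad = []
    for i, earlier in enumerate(original):
        for later in original[i + 1:]:
            if depends(later, earlier) and pos[later.name] < pos[earlier.name]:
                bad.append((earlier.name, later.name))
    return bad

# Hypothetical four-instruction program: b needs a's result, d needs c's.
prog = [
    Instr("a", "r1", ("r0",)),
    Instr("b", "r2", ("r1",)),
    Instr("c", "r3", ("r0",)),
    Instr("d", "r4", ("r3",)),
]
ok = [prog[2], prog[0], prog[3], prog[1]]      # c, a, d, b: legal
broken = [prog[1], prog[0], prog[2], prog[3]]  # b hoisted above a: illegal
```

Here `violations(prog, ok)` is empty, while `violations(prog, broken)` reports
the pair `("a", "b")`. The write-after-read and write-after-write checks are
the constraints that register renaming removes in hardware; a hand scheduler
working with a fixed register set has to respect them.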

One of the biggest gains I had was moving register loads a fair number of cycles
earlier in the instruction stream when I had free slots. This is difficult on
IA-32 because it has only eight general-purpose registers, but it works very
well when you have a larger number of registers.
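
A rough way to see why issuing loads earlier helps is a toy model of a
single-issue, in-order machine, sketched here in Python. The latencies (3
cycles for a load, 1 for an ALU op), the register names and the two instruction
sequences are assumptions chosen for illustration, not Athlon figures.

```python
LATENCY = {"load": 3, "alu": 1}   # assumed latencies, not measured values

def cycles(program):
    """Cycle at which the last result is ready on a single-issue,
    in-order machine that stalls until an instruction's sources are ready."""
    ready = {}   # register -> cycle its value becomes available
    cycle = 0    # next free issue cycle
    for op, dst, srcs in program:
        # Registers never written in the program count as ready at cycle 0.
        start = max([cycle] + [ready.get(r, 0) for r in srcs])
        ready[dst] = start + LATENCY[op]
        cycle = start + 1
    return max(ready.values())

# The load is issued right before its consumer, so the consumer stalls.
naive = [
    ("alu",  "r1", ("r0",)),
    ("alu",  "r2", ("r1",)),
    ("load", "r3", ("p",)),
    ("alu",  "r4", ("r3", "r2")),
]

# Same work with the load hoisted to the top; its latency overlaps the ALU ops.
hoisted = [
    ("load", "r3", ("p",)),
    ("alu",  "r1", ("r0",)),
    ("alu",  "r2", ("r1",)),
    ("alu",  "r4", ("r3", "r2")),
]
```

Under these assumptions `cycles(naive)` is 6 and `cycles(hoisted)` is 4. The
cost is that `r3` is occupied for the whole sequence instead of only the last
two instructions, which is why the trick gets harder as free registers run
out.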

-Matt



Last modified: Thu, 15 Apr 21 08:11:13 -0700
