Computer Chess Club Archives


Subject: Re: cmov isn't necessarily good

Author: Tom Kerrigan

Date: 16:15:26 07/21/03


On July 20, 2003 at 14:48:09, Robert Hyatt wrote:

>On July 19, 2003 at 02:11:34, Tom Kerrigan wrote:
>
>>On July 19, 2003 at 01:11:31, Robert Hyatt wrote:
>>
>>>On July 18, 2003 at 15:16:27, Tom Kerrigan wrote:
>>>
>>>>On July 18, 2003 at 04:05:52, Walter Faxon wrote:
>>>>
>>>>>>; 326  :     if (bbHalf) bb0 = bb1;              // will code as cmov (ideally)
>>>>>>
>>>>>>	test	ecx, ecx
>>>>>>	je	SHORT $L806
>>>>>>	mov	eax, DWORD PTR _bb$[esp]
>>>>>>$L806:
>>>>>>
>>>>>
>>>>>
>>>>>Stupid compiler, not only no cmov
>>>>
>>>>IIRC, on the P6 (Pentium Pro, Pentium II, Pentium III), the cmov instruction
>>>>gets translated into a string of uOps that's equivalent to testing, branching,
>>>>and copying.
>>>>
>>>>In other words, there is no performance benefit (I believe there may actually be
>>>>a performance penalty) to using cmov on a P6, and it breaks compatibility with
>>>>pre-P6 processors, so it's little wonder the P6-era MS compiler doesn't generate
>>>>cmovs.
>>>>
>>>>-Tom
>>>
>>>
>>>I think the point is that the cmov eliminates any possibility of a branch
>>>mis-prediction.  On the long PIV pipeline, that's a significant savings for
>>>mis-predicted branches.
>>>
>>>Since Eugene's example shows that the new MSVC compiler is going to finally
>>>emit cmov instructions, I'd assume there is a performance gain for doing
>>>so.
>>
>>Yes, of course, I thought I had made it perfectly clear that I was talking about
>>the _P6_ core. I wrote all of them out. Pentium Pro, Pentium II, Pentium III.
>>_Not_ Pentium 4.
>>
>>-Tom
>
>I don't see why it would be worse on a P6 core either.  IE on a P6, if the
>branch is mis-predicted, you _still_ have to back out all the stuff that has
>been speculatively executed, including any out-of-order stuff as well.  The
>CMOV eliminates a lot of that.

I'm sorry, but can you read at all? This is astounding.

The only point that my original post conveys is that for the _P6_ core the cmov
instruction gets translated into _branch_ and copy uOps.

You've already managed to miss one key point, namely that I was talking about
the P6 core. You wrote a big post about the Pentium 4, which my post was
obviously not addressing.

Now it seems like you've missed the other key point, namely that cmov produces a
branch uOp, so unless I'm being especially dense, it CAN be mispredicted, just
like any other branch.
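
For anyone who wants to look at the code generation themselves, here is a minimal sketch of the source pattern under discussion. The function names select_branch and select_mask are made up for illustration, and I am assuming 32-bit values as in the quoted listing. The first function is the "if (bbHalf) bb0 = bb1;" form, which a compiler may turn into either a test/branch/mov sequence or a cmov depending on the compiler and target flags. The second computes the same result with a mask, so there is no conditional in the source at all.

```c
#include <stdint.h>

/* Conditional form, same shape as the quoted line:
   if (bbHalf) bb0 = bb1;
   The compiler is free to emit a branch or a cmov here. */
static uint32_t select_branch(uint32_t bbHalf, uint32_t bb0, uint32_t bb1)
{
    if (bbHalf)
        bb0 = bb1;
    return bb0;
}

/* Mask form: the same selection written with plain ALU operations.
   mask is all ones when bbHalf is nonzero, and zero otherwise. */
static uint32_t select_mask(uint32_t bbHalf, uint32_t bb0, uint32_t bb1)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(bbHalf != 0);
    return (bb1 & mask) | (bb0 & ~mask);
}
```

Both functions return bb1 when bbHalf is nonzero and bb0 otherwise. Which one is faster on a given core is exactly the kind of thing you have to measure rather than assume.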

Of course, this is contrary to the point of a conditional move instruction. My
only comment to that is that Intel must have decided to add the conditional move
after they were done designing the relevant parts of the core. The decision to
add the instruction makes sense for forward-compatibility, i.e., "use this
instruction and you will see a performance improvement with it on later
processors."

-Tom



Last modified: Thu, 15 Apr 21 08:11:13 -0700
