Computer Chess Club Archives

Subject: Re: Simple optimization question

Author: Gerd Isenberg
Date: 11:39:26 01/09/04
On January 09, 2004 at 09:44:03, Robert Hyatt wrote:

>On January 09, 2004 at 09:01:23, Gerd Isenberg wrote:
>
>>On January 09, 2004 at 06:46:00, Tord Romstad wrote:
>>
>>>By reading this forum, I've understood that "if" statements are considered
>>>evil and that it is often a good idea to remove them if it is possible.  Suppose
>>>that I have code which looks like this:
>>>
>>>if(x) y += 20;
>>>
>>>Would it then be advantageous to rewrite the code like this?
>>>
>>>y += (!(!x))*20;
>>>
>>>In my evaluation function, I have a lot of conditionals which could be avoided
>>>by using tricks similar to the one above, but before doing it I would like to
>>>make sure it is really a good idea.  After all, the first form above is much
>>>more readable.
>>>
>>>Tord
>>
>>Hi Tord,
>>
>>In general it is a good idea to avoid branches on today's deeply pipelined
>>processors, especially if the branch body is small and the condition is
>>"random" and hard for the processor to predict.
>>
>>The drawback of y += (x!=0)*K is that it needs an additional register and more
>>instructions, so it only pays off if register pressure is rather low, the
>>condition is random and the target is already loaded in a register:
>>
>>if (eax > ebx) ecx += 20;
>>
>>   cmp  eax, ebx
>>   jle  l1
>>   add  ecx, 20
>>l1:
>>
>>ecx += (eax > ebx) * 20;
>>
>>   xor  edx, edx ; zero edx, because setcc writes a byte register only
>>   cmp  eax, ebx
>>   setg dl       ; edx := (eax > ebx)
>>   shl  edx, 2   ; * 4
>>   lea  edx,[edx+edx*4] ; * 5
>>   add  ecx, edx
>>
>>Depending on the constant, the multiplication may be done by shift, add and lea
>>instructions, but with some constants (or even variables) it is faster to avoid
>>the "mul" and to use "and" with a boolean mask (-true=>0xffffffff,-false=>0):
>>
>>y += -(x != 0) & z;
>
>Won't _that_ have a branch in it as well?  Or else at least a long
>pipeline stall where you want the result of the comparison before you
>can use it (ie some sort of setxx instruction as above)?
>
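For reference, the three forms discussed so far can be written as plain C
functions. This is only a sketch: the function names are made up for
illustration, and unsigned arithmetic is used so that wraparound is well
defined. Whether a compiler turns the multiply or mask form into setcc, cmov or
a branch again depends on the compiler and its flags, so looking at the
generated assembly is the only way to know.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branching form: if (x) y += k; */
static uint32_t add_if_branch(uint32_t y, int32_t x, uint32_t k) {
    if (x) y += k;
    return y;
}

/* Multiply form: (x != 0) evaluates to 0 or 1. */
static uint32_t add_if_mul(uint32_t y, int32_t x, uint32_t k) {
    return y + (uint32_t)(x != 0) * k;
}

/* Mask form: -(x != 0) is 0 or -1, i.e. 0 or 0xffffffff as uint32_t. */
static uint32_t add_if_mask(uint32_t y, int32_t x, uint32_t k) {
    uint32_t mask = (uint32_t)-(x != 0);
    return y + (mask & k);
}

/* Hypothetical self-check: all three forms agree on a few inputs. */
static void check_equivalence(void) {
    const int32_t xs[] = { 0, 1, -1, 42, INT32_MIN };
    for (size_t i = 0; i < sizeof xs / sizeof xs[0]; ++i) {
        uint32_t expected = add_if_branch(100u, xs[i], 20u);
        assert(add_if_mul(100u, xs[i], 20u) == expected);
        assert(add_if_mask(100u, xs[i], 20u) == expected);
    }
}
```

Calling check_equivalence() from a test driver confirms only that the three
forms compute the same value. It says nothing about which one is faster.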

What about this one (not tested)?
A compare (sub) instruction sets the carry flag, i.e. the unsigned borrow.
Then either zero minus borrow (sbb) or minus one plus carry (adc) builds the -1 mask.

if ( eax != 0 ) ebx += ecx;

ebx += -(eax != 0)    & ecx;
ebx += ((eax == 0)-1) & ecx;


   cmp  eax, 1   ; carry (borrow) := (eax == 0)
   mov  edx, -1
   adc  edx, 0   ; edx := (eax == 0)-1
   ...
   and  edx, ecx
   ...
   add  ebx, edx ; +=
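The same idea at the C level, as a rough sketch: the mask is built as
(a == 0) - 1, which mirrors the cmp / mov -1 / adc sequence. The function names
are invented for illustration, and nothing guarantees that a compiler emits
exactly those instructions for it.

```c
#include <assert.h>
#include <stdint.h>

/* Branching reference: if (a != 0) b += c; */
static uint32_t add_if_nonzero_branch(uint32_t a, uint32_t b, uint32_t c) {
    if (a != 0) b += c;
    return b;
}

/* Branchless form: mask = (a == 0) - 1, as in "mov edx,-1 / adc edx,0". */
static uint32_t add_if_nonzero_mask(uint32_t a, uint32_t b, uint32_t c) {
    uint32_t mask = (uint32_t)(a == 0) - 1u; /* 0 if a == 0, else 0xffffffff */
    return b + (mask & c);
}

/* Hypothetical self-check against the branching reference. */
static void check_nonzero_mask(void) {
    const uint32_t as[] = { 0u, 1u, 2u, 0x80000000u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof as / sizeof as[0]; ++i) {
        assert(add_if_nonzero_mask(as[i], 5u, 7u) ==
               add_if_nonzero_branch(as[i], 5u, 7u));
    }
}
```

Using unsigned types keeps the subtraction and the addition well defined on
wraparound, which matches what the registers do in the assembly above.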

Are compilers able to produce such code for simple conditional 8/16/32/64-bit
adds, if a register is available? I imagine some kind of profile-guided
optimization, where critical, often mispredicted, "very near" forward
branches with a conditionally skipped add/sub may be the target of such
branchless optimizations ;-)

I guess similar instructions are available for other architectures as well.


>
>>
>>The question is whether those micro-optimizations should better be done by the
>>compiler. Anyway, I use these tricks rarely, here and there, with some slight
>>speedup.
>>
>>Cheers,
>>Gerd
Last modified: Thu, 15 Apr 21 08:11:13 -0700