Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Simple optimization question

Author: Gerd Isenberg

Date: 07:05:27 01/09/04

Go up one level in this thread


On January 09, 2004 at 09:44:03, Robert Hyatt wrote:

>On January 09, 2004 at 09:01:23, Gerd Isenberg wrote:
>
>>On January 09, 2004 at 06:46:00, Tord Romstad wrote:
>>
>>>By reading this forum, I've understood that "if" statements are considered
>>>evil and that it is often a good idea to remove them if it is possible.  Suppose
>>>that I have code which looks like this:
>>>
>>>if(x) y += 20;
>>>
>>>Would it then be advantageous to rewrite the code like this?
>>>
>>>y += (!(!x))*20;
>>>
>>>In my evaluation function, I have a lot of conditionals which could be avoided
>>>by
>>>using tricks similar to the one above, but before doing it I would like to make
>>>sure it is really a good idea.  After all, the first form above is much more
>>>readable.
>>>
>>>Tord
>>
>>Hi Tord,
>>
>>in general it is a good idea to avoid branches with todays super pipelined
>>processors, specially if the branch-body is small and the condition is "random"
>>and difficult to predict for the processor.
>>
>>The drawback with y += (x!=0)*K is the need of an additional register and more
>>instructions, so it only pays off, if the register pressure is rather low, the
>>condition is random and the target is already loaded inside a register:
>>
>>if (eax > ebx) ecx += 20;
>>
>>   cmp  eax, ebx
>>   jle  l1
>>   add  ecx, 20
>>l1:
>>
>>ecx += (eax > ebx) * 20;
>>
>>   xor  edx, edx ; zero edx, because set instruction use byte register only
>>   cmp  eax, ebx
>>   setg dl       ; edx := (eax > ebx)
>>   shl  edx, 2   ; * 4
>>   lea  edx,[edx+edx*4] ; * 5
>>   add  ecx, edx
>>
>>Depending on the constant, the multiplication may done by shift,add,lea
>>instructions, but with some constants (or even variables) it is faster to avoid
>>the "mul" and to use "and" with a boolean mask (-true=>0xffffffff,-false=>0):
>>
>>y += -(x != 0) & z;
>
>Won't _that_ have a branch in it as well?  Or else at least a long
>pipeline stall where you want the result of the comparison before you
>can use it (ie some sort of setxx instruction as above)?

With simple 32-bit int conditions i found it always branchless with set
instruction (MSVC6). With register dependencies you are right, but probably
less expensive as 50% miss predictions, specially considering out of order
execution of other instructions.

>
>
>>
>>The question is whether those micro-optimizations should better be done by the
>>compiler. Anyway, i use this tricks rarely here and there with some slight
>>speedup.
>>
>>Cheers,
>>Gerd



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.