Author: Gerd Isenberg
Date: 15:23:33 01/09/04
Go up one level in this thread
On January 09, 2004 at 17:26:42, Dezhi Zhao wrote:
>On January 09, 2004 at 10:05:27, Gerd Isenberg wrote:
>
>>On January 09, 2004 at 09:44:03, Robert Hyatt wrote:
>>
>>>On January 09, 2004 at 09:01:23, Gerd Isenberg wrote:
>>>
>>>>On January 09, 2004 at 06:46:00, Tord Romstad wrote:
>>>>
>>>>>By reading this forum, I've understood that "if" statements are considered
>>>>>evil and that it is often a good idea to remove them if it is possible. Suppose
>>>>>that I have code which looks like this:
>>>>>
>>>>>if(x) y += 20;
>>>>>
>>>>>Would it then be advantageous to rewrite the code like this?
>>>>>
>>>>>y += (!(!x))*20;
>>>>>
>>>>>In my evaluation function, I have a lot of conditionals which could be avoided
>>>>>by
>>>>>using tricks similar to the one above, but before doing it I would like to make
>>>>>sure it is really a good idea. After all, the first form above is much more
>>>>>readable.
>>>>>
>>>>>Tord
>>>>
>>>>Hi Tord,
>>>>
>>>>in general it is a good idea to avoid branches with todays super pipelined
>>>>processors, specially if the branch-body is small and the condition is "random"
>>>>and difficult to predict for the processor.
>>>>
>>>>The drawback with y += (x!=0)*K is the need of an additional register and more
>>>>instructions, so it only pays off, if the register pressure is rather low, the
>>>>condition is random and the target is already loaded inside a register:
>>>>
>>>>if (eax > ebx) ecx += 20;
>>>>
>>>> cmp eax, ebx
>>>> jle l1
>>>> add ecx, 20
>>>>l1:
>>>>
>>>>ecx += (eax > ebx) * 20;
>>>>
>>>> xor edx, edx ; zero edx, because set instruction use byte register only
>>>> cmp eax, ebx
>>>> setg dl ; edx := (eax > ebx)
>>>> shl edx, 2 ; * 4
>>>> lea edx,[edx+edx*4] ; * 5
>>>> add ecx, edx
>>>>
>>>>Depending on the constant, the multiplication may done by shift,add,lea
>>>>instructions, but with some constants (or even variables) it is faster to avoid
>>>>the "mul" and to use "and" with a boolean mask (-true=>0xffffffff,-false=>0):
>>>>
>>>>y += -(x != 0) & z;
>>>
>>>Won't _that_ have a branch in it as well? Or else at least a long
>>>pipeline stall where you want the result of the comparison before you
>>>can use it (ie some sort of setxx instruction as above)?
>>
>>With simple 32-bit int conditions i found it always branchless with set
>>instruction (MSVC6). With register dependencies you are right, but probably
>>less expensive as 50% miss predictions, specially considering out of order
>>execution of other instructions.
>>
>
>one more point here. Nowadays the adc, sbb instructions that are often used to
>impleament the trick are quite slow. setcond is is not too bad, but not as good
>as before. And of course, the trick often incress register pressure.
>So the bottom line is to do a test in your application:)
Sure, but why are adc/sbb slow? On P4? Because of flag dependency?
AMD reports one cycle latency like add/sub. My compiler produces a setCC
variation:
int foo(int a, int b, int c)
{
if ( a != 0 ) b += c;
b += ((a == 0)-1) & c;
__asm // for code size reasons
{
xor edx, edx
cmp ebx, 1 ; carry (borrow) := (eax == 0)
adc edx, -1 ; edx := (eax == 0)-1
and edx, ecx
add eax, edx ; +=
}
return b;
}
; Function compile flags: /Ogtay
_TEXT SEGMENT
_a$ = 8
_b$ = 12
_c$ = 16
?foo@@YAHHHH@Z PROC NEAR ; foo
; Line 1481
01b60 8b 4c 24 04 mov ecx, DWORD PTR _a$[esp-4]
01b64 85 c9 test ecx, ecx
01b66 56 push esi
01b67 8b 74 24 10 mov esi, DWORD PTR _c$[esp]
01b6b 74 04 je SHORT $L43006
01b6d 01 74 24 0c add DWORD PTR _b$[esp], esi
$L43006:
01b71 33 d2 xor edx, edx
01b73 83 fb 01 cmp ebx, 1
01b76 83 d2 ff adc edx, -1
01b79 23 d1 and edx, ecx
01b7b 03 c2 add eax, edx
; Line 1484
01b7d 33 c0 xor eax, eax
01b7f 85 c9 test ecx, ecx
01b81 8b 4c 24 0c mov ecx, DWORD PTR _b$[esp]
01b85 0f 94 c0 sete al
01b88 48 dec eax
01b89 23 c6 and eax, esi
01b8b 03 c1 add eax, ecx
01b8d 5e pop esi
; Line 1495
01b8e c3 ret 0
?foo@@YAHHHH@Z ENDP ; foo
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.