Author: Gerd Isenberg
Date: 11:39:26 01/09/04
On January 09, 2004 at 09:44:03, Robert Hyatt wrote:

>On January 09, 2004 at 09:01:23, Gerd Isenberg wrote:
>
>>On January 09, 2004 at 06:46:00, Tord Romstad wrote:
>>
>>>By reading this forum, I've understood that "if" statements are considered
>>>evil and that it is often a good idea to remove them if it is possible. Suppose
>>>that I have code which looks like this:
>>>
>>>if(x) y += 20;
>>>
>>>Would it then be advantageous to rewrite the code like this?
>>>
>>>y += (!(!x))*20;
>>>
>>>In my evaluation function, I have a lot of conditionals which could be avoided
>>>by using tricks similar to the one above, but before doing it I would like to
>>>make sure it is really a good idea. After all, the first form above is much
>>>more readable.
>>>
>>>Tord
>>
>>Hi Tord,
>>
>>in general it is a good idea to avoid branches with todays super pipelined
>>processors, specially if the branch-body is small and the condition is "random"
>>and difficult to predict for the processor.
>>
>>The drawback with y += (x!=0)*K is the need of an additional register and more
>>instructions, so it only pays off, if the register pressure is rather low, the
>>condition is random and the target is already loaded inside a register:
>>
>>if (eax > ebx) ecx += 20;
>>
>>  cmp eax, ebx
>>  jle l1
>>  add ecx, 20
>>l1:
>>
>>ecx += (eax > ebx) * 20;
>>
>>  xor edx, edx ; zero edx, because set instruction use byte register only
>>  cmp eax, ebx
>>  setg dl ; edx := (eax > ebx)
>>  shl edx, 2 ; * 4
>>  lea edx,[edx+edx*4] ; * 5
>>  add ecx, edx
>>
>>Depending on the constant, the multiplication may done by shift,add,lea
>>instructions, but with some constants (or even variables) it is faster to avoid
>>the "mul" and to use "and" with a boolean mask (-true=>0xffffffff,-false=>0):
>>
>>y += -(x != 0) & z;
>
>Won't _that_ have a branch in it as well? Or else at least a long
>pipeline stall where you want the result of the comparison before you
>can use it (ie some sort of setxx instruction as above)?
>

What about this one (not tested)? A compare (sub) instruction sets the carry
(unsigned borrow) flag, and the -1 mask is then built up either as zero minus
borrow or as minus one plus carry.

if ( eax != 0 ) ebx += ecx;

ebx += -(eax != 0) & ecx;
ebx += ((eax == 0)-1) & ecx;

  cmp eax, 1   ; carry (borrow) := (eax == 0)
  mov edx, -1
  adc edx, 0   ; edx := (eax == 0)-1
  ...
  and edx, ecx
  ...
  add ebx, edx ; +=

Are compilers able to produce such code for simple conditional 8/16/32/64-bit
adds if a register is available? I imagine some kind of profile-guided
optimization, where critical, often mispredicted "very near" forward branches
with a conditionally skipped add/sub may be the target of such branchless
optimizations ;-)

I guess similar instructions are available for other architectures as well.

>
>>
>>The question is whether those micro-optimizations should better be done by the
>>compiler. Anyway, i use this tricks rarely here and there with some slight
>>speedup.
>>
>>Cheers,
>>Gerd
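
For anyone who wants to check the two mask expressions from the post against the plain "if" version, here is a minimal C sketch. The function names (add_if_branch, add_if_neg_mask, add_if_adc_mask) are made up for illustration and do not come from the thread. Unsigned 32-bit values are assumed so that the wrap-around to 0xffffffff is well defined. Whether a given compiler turns these into setcc/adc sequences or back into a branch is not guaranteed, so the generated assembly still has to be inspected.

```c
#include <stdint.h>

/* Reference version with a branch: if (x != 0) y += z; */
static uint32_t add_if_branch(uint32_t y, uint32_t x, uint32_t z) {
    if (x != 0) y += z;
    return y;
}

/* y += -(x != 0) & z;
   The comparison yields 0 or 1; subtracting it from 0 in unsigned
   arithmetic gives 0 or 0xffffffff, which serves as the mask. */
static uint32_t add_if_neg_mask(uint32_t y, uint32_t x, uint32_t z) {
    uint32_t mask = (uint32_t)0 - (uint32_t)(x != 0);
    return y + (mask & z);
}

/* y += ((x == 0) - 1) & z;
   Mirrors the cmp/mov/adc idea: start from -1 and add the "carry",
   which is 1 exactly when x == 0. */
static uint32_t add_if_adc_mask(uint32_t y, uint32_t x, uint32_t z) {
    uint32_t mask = (uint32_t)(x == 0) - 1u;
    return y + (mask & z);
}
```

All three functions should return the same value for every input; the only intended difference is whether the source expresses the condition as control flow or as data.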
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.