Author: Gerd Isenberg
Date: 09:41:04 12/13/05
Go up one level in this thread
On December 13, 2005 at 10:36:19, Chrilly Donninger wrote:
>On December 12, 2005 at 15:31:07, Gerd Isenberg wrote:
>
>>of course also saving the bestscore assignment in that case:
>>
>>if (val > bestscore)
>>{
>> if( val >= beta )
>> return val; // Actually goto return-sequence.
>> bestscore = val;
>> if(val > alpha)
>> alpha = val;
>>}
>>
>>or what about improving branch-prediction in some way:
>>
>>if( val >= beta )
>> return val; // Actually goto return-sequence.
>>if (val > bestscore)
>>{ // most likely not taken
>> bestscore = val;
>> if(val > alpha)
>> alpha = val;
>>}
>
>
>The optimizing compilers are shuffling the statements around. Sometimes they
>implement some very clever tricks to avoid a jumpt at all. I have not seen this
>from gcc output, put Visual-C and especially the Intel compiler do amazing
>tricks to avoid jumps.
>As a C-programmer one has no real control. I think the only way is to run the
>Code through VTune and look on the reprots.
>As a rule of thumb, the condition is usually reversed by the compiler.
>
>Chrilly
Yes, there are a lot of techniques to avoid branches.
Otoh branch prediction heuristics of todays processors are really sophisticated,
and most often - if no "really" random pattern occur -
and there are not too many branches that the btb got trashed, conditional
branches are still fine and produce most often shortes/fastest code.
We had some discussions here recently about that - a brief summary:
First to mention is cmovxx.
Then there are all kind of "boolean" multiplications by anding a value with a
{0,-1}-mask. One trick, often seen in msvc is using setCC with a formerly
cleared 32-bit register:
cmp/test... ; set some flags
setCC ; 0,1
neg ; 0,-1
cmp/test...; set some flags
setCC ; 0,1
dec ; -1,0
With carry flag one has sbb reg32,reg32 to build the mask.
With sign flag (eg. after a subtraction), one may use sar 31 or cdq to build a
{0,-1}-mask instead of setcc, neg/dec.
With some care and knowledge about reduced value-ranges, where the compiler is
ususally not aware of, the setCC-Trick and specially the sign-trick may also be
applied at C-level to force branchless code:
1) if ( booleanCondition ) x += y;
-> x += y & -( booleanCondition );
-> x += y & (!booleanCondition - 1);
2) if (depth < 0) fm = fm1; else fm = fm2;
-> fm = fm2 +((depth>>31) & (fm1-fm2));
3) fm = depth < -1 ? fm1 : depth < 0 ? fm2 : fm3;
-> fm = fm3 + ( (depth >>31) & (fm2-fm3))
+ (((depth+1)>>31) & (fm1-fm2));
// signed max/min/abs:
max(x,y) := x - ((x-y) & ((x-y)>>31));
min(x,y) := y + ((x-y) & ((x-y)>>31));
abs(x) := ( x ^ (x>>31) ) - (x>>31); // patented by sun!
abs(x) := ( x + (x>>31) ) ^ (x>>31);
// abs with sign extension via cdq:
int abs(int x)
{
int m = (int)( (unsigned __int64)((__int64)x) >> 32 );
return (x + m) ^ m;
}
// msvc 6.0
?abs@@YIHH@Z PROC NEAR ; abs, COMDAT
00000 8b c1 mov eax, ecx
00002 99 cdq
00003 8d 04 0a lea eax, [edx+ecx]
00006 33 c2 xor eax, edx
00008 c3 ret 0
?abs@@YIHH@Z ENDP ; abs
For another branchless domain, a SIMD-wise bool[64]*byte[64] dot-prduct, you may
have a look the ccc-shadow archives:
http://hornid.com/cgi-bin/ccc/topic_show.pl?pid=399116;hl=#pid399116
For the val > bestScore,alpha and >= beta case, i prefere not to fool the
compiler. If val >= beta is not true, val > alpha becomes never true with
nullwindows. Therefor i don't think a branchless alpha = max(val,alpha) makes
sense here:
if (val > bestscore)
{
if( val >= beta )
return val; // Actually goto return-sequence.
bestscore = val;
if(val > alpha)
alpha = val; // not taken if beta == alpha + 1
}
Gerd
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.