Author: Gerd Isenberg
Date: 04:58:12 06/02/04
On June 01, 2004 at 19:17:09, Gerd Isenberg wrote:
><snip>
>>>Hi Volker,
>>>
>>>Not sure, I guess current compilers are not aware of the RCL trick.
>>>So you may still use (inline) assembler for best performance.
>>>
>>>Something like this in C may result in a chain of cmp, setnz and lea
>>>instructions:
>>>
>>>bitfield = (q8 != 0);
>>>bitfield <<= 1;
>>>bitfield += (q7 != 0);
>>>bitfield <<= 1;
>>>...
>>>bitfield += (q0 != 0);
>>>
>>>or more compact:
>>>
>>>bitfield = ...((((((q8!=0)<<1)+(q7!=0))<<1)+(q6!=0))<<1)...
>>>
>>>Btw. have you considered using bitboards?
>>>
>>>Cheers,
>>>Gerd
>>
>>No I use 0x88. Seems best to me to build incremental attack tables. Thanks for
>>the answer, yes adding a boolean may work ... testing ...
>>
>>Volker
>
>If you introduce more of that stuff, going from packing square metrics to setwise
>metrics, you may consider bitboards some day ;-)
>
>Anyway, even if you avoid (difficult or easy to predict?) branches with
>cmp-setCC/rcl/adc instructions the problem is a kind of stall, because setCC
>needs to wait for the flag outcome of cmp.
>
>Therefore, to break dependencies, it may be smarter to build "boolean" {-1,0}
>values inside a register instead of a {true,false} carry flag. That allows much
>more of the code to execute simultaneously.
>
>Assuming 32-bit int, arithmetic (signed) right shift and positive q values:
>
>bitfield = ((-q8>>31) & 128)
> | ((-q7>>31) & 64)
> | ((-q6>>31) & 32)
> | ((-q5>>31) & 16)
> | ((-q4>>31) & 8)
> | ((-q3>>31) & 4)
> | ((-q2>>31) & 2)
> | ((-q1>>31) & 1);
>
>OTOH if the conditional jumps are most often predicted correctly...
>
>Cheers,
>Gerd
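A minimal sketch of the quoted idea, for anyone who wants to convince themselves that the {-1,0} mask version packs the same bits as the comparison chain. The function names pack_compare and pack_mask are hypothetical, and the q values are passed as an array (q[0] goes to bit 0) rather than as q1..q8. It relies on the same assumptions as the quoted code: 32-bit int, arithmetic right shift of negative values (implementation-defined in C) and non-negative q.

```c
#include <assert.h>
#include <limits.h>

/* Reference version: one bit per nonzero q, built with comparisons.
   q[7] ends up in bit 7, q[0] in bit 0. */
static unsigned pack_compare(const int q[8])
{
    unsigned bitfield = 0;
    for (int i = 7; i >= 0; --i)
        bitfield = (bitfield << 1) + (q[i] != 0);
    return bitfield;
}

/* Branch-free version: (-q >> 31) is -1 (all ones) for q > 0 and 0 for
   q == 0, so ANDing with a single-bit mask selects that bit or nothing.
   Assumption: int is 32 bits, >> on a negative int is an arithmetic
   shift, and every q is >= 0. */
static unsigned pack_mask(const int q[8])
{
    assert(sizeof(int) * CHAR_BIT == 32);
    unsigned bitfield = 0;
    for (int i = 0; i < 8; ++i)
        bitfield |= (unsigned)(-q[i] >> 31) & (1u << i);
    return bitfield;
}
```

Each term in pack_mask depends only on its own q, so the eight negate/shift/and sequences are independent of each other; only the final ors form a chain.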
One additional note: if your q is signed char, the msc6 compiler is able to do a
nice optimization, with only 8 negates and ands, a few ors, but only one final shift!
Gerd
int pack2bits(signed char q[8])
{
    unsigned int bitfield;
    bitfield = ((-q[7]>>7) & 128)
             | ((-q[6]>>7) & 64)
             | ((-q[5]>>7) & 32)
             | ((-q[4]>>7) & 16)
             | ((-q[3]>>7) & 8)
             | ((-q[2]>>7) & 4)
             | ((-q[1]>>7) & 2)
             | ((-q[0]>>7) & 1);
    return bitfield;
}
PUBLIC ?pack2bits@@YAHQAC@Z ; pack2bits
; Function compile flags: /Ogty
; COMDAT ?pack2bits@@YAHQAC@Z
_TEXT SEGMENT
_q$ = 8
?pack2bits@@YAHQAC@Z PROC NEAR ; pack2bits, COMDAT
00000 8b 4c 24 04 mov ecx, DWORD PTR _q$[esp-4]
00004 0f be 41 07 movsx eax, BYTE PTR [ecx+7]
00008 0f be 51 06 movsx edx, BYTE PTR [ecx+6]
0000c f7 d8 neg eax
0000e f7 da neg edx
00010 25 00 40 00 00 and eax, 16384 ; 00004000H
00015 81 e2 00 20 00 00 and edx, 8192 ; 00002000H
0001b 0b c2 or eax, edx
0001d 0f be 51 05 movsx edx, BYTE PTR [ecx+5]
00021 f7 da neg edx
00023 81 e2 00 10 00 00 and edx, 4096 ; 00001000H
00029 0b c2 or eax, edx
0002b 0f be 51 04 movsx edx, BYTE PTR [ecx+4]
0002f f7 da neg edx
00031 81 e2 00 08 00 00 and edx, 2048 ; 00000800H
00037 0b c2 or eax, edx
00039 0f be 51 03 movsx edx, BYTE PTR [ecx+3]
0003d f7 da neg edx
0003f 81 e2 00 04 00 00 and edx, 1024 ; 00000400H
00045 0b c2 or eax, edx
00047 0f be 51 02 movsx edx, BYTE PTR [ecx+2]
0004b f7 da neg edx
0004d 81 e2 00 02 00 00 and edx, 512 ; 00000200H
00053 0b c2 or eax, edx
00055 0f be 51 01 movsx edx, BYTE PTR [ecx+1]
00059 8a 09 mov cl, BYTE PTR [ecx]
0005b f7 da neg edx
0005d 81 e2 00 01 00 00 and edx, 256 ; 00000100H
00063 f7 d9 neg ecx
00065 0b c2 or eax, edx
00067 81 e1 80 00 00 00 and ecx, 128 ; 00000080H
0006d 0b c1 or eax, ecx
0006f c1 f8 07 sar eax, 7
00072 c3 ret 0
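The listing masks each negated value with 16384 down to 128 (the source masks shifted left by 7) and shifts only once at the end. A small sketch, under the same assumptions as above (arithmetic right shift, q in 0..127), that checks this mask-then-shift form against the shift-then-mask source. The names pack_shift_each, pack_shift_once and shift_forms_agree are hypothetical and not from the original code.

```c
#include <assert.h>

/* Shift-then-mask, as written in the C source: eight shifts. */
static unsigned pack_shift_each(const signed char q[8])
{
    unsigned bitfield = 0;
    for (int i = 0; i < 8; ++i)
        bitfield |= (unsigned)(-q[i] >> 7) & (1u << i);
    return bitfield;
}

/* Mask-then-shift, mirroring the listing: masks 128..16384, one shift.
   Bit i+7 of -q is kept in place and moved down to bit i at the end. */
static unsigned pack_shift_once(const signed char q[8])
{
    int acc = 0;
    for (int i = 0; i < 8; ++i)
        acc |= -q[i] & (128 << i);
    assert(acc >= 0);              /* only bits 7..14 can be set */
    return (unsigned)(acc >> 7);
}

/* Returns 1 if both forms agree for every value 0..127 in every slot. */
static int shift_forms_agree(void)
{
    for (int slot = 0; slot < 8; ++slot) {
        for (int v = 0; v <= 127; ++v) {
            signed char q[8] = {0};
            q[slot] = (signed char)v;
            if (pack_shift_each(q) != pack_shift_once(q))
                return 0;
        }
    }
    return 1;
}
```

The two forms agree because selecting bit i of (x >> 7) is the same as selecting bit i+7 of x and shifting afterwards, and the shift distributes over the ors.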
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.