Computer Chess Club Archives


Subject: smart compiler

Author: Gerd Isenberg

Date: 04:58:12 06/02/04


On June 01, 2004 at 19:17:09, Gerd Isenberg wrote:

><snip>
>>>Hi Volker,
>>>
>>>Not sure, I guess current compilers are not aware of the RCL trick.
>>>So you may still use (inline) assembler for best performance.
>>>
>>>Something like this in C may result in a chain of cmp, setnz and lea
>>>instructions:
>>>
>>>bitfield  = (q8 != 0);
>>>bitfield <<= 1;
>>>bitfield += (q7 != 0);
>>>bitfield <<= 1;
>>>...
>>>bitfield += (q0 != 0);
>>>
>>>or more compact:
>>>
>>>bitfield = ...((((((q8!=0)<<1)+(q7!=0))<<1)+(q6!=0))<<1)...
>>>
>>>Btw. have you considered using bitboards?
>>>
>>>Cheers,
>>>Gerd
>>
>>No I use 0x88. Seems best to me to build incremental attack tables. Thanks for
>>the answer, yes adding a boolean may work ... testing ...
>>
>>Volker
>
>If you introduce more of that stuff, going from packing square metrics to
>setwise metrics, you may consider bitboards some time ;-)
>
>Anyway, even if you avoid (difficult or easy to predict?) branches with
>cmp-setCC/rcl/adc instructions, the problem is a kind of stall, because setCC
>needs to wait for the flag outcome of cmp.
>
>Therefore, to break dependencies, it may be smarter to build "boolean" {-1,0}
>values inside a register instead of a {true,false} carry flag. That lets
>much more of the code execute in parallel.
>
>assuming 32-bit int, arithmetic (sign-propagating) right shift and
>non-negative q values:
>
>bitfield = ((-q8>>31) & 128)
>         | ((-q7>>31) &  64)
>         | ((-q6>>31) &  32)
>         | ((-q5>>31) &  16)
>         | ((-q4>>31) &   8)
>         | ((-q3>>31) &   4)
>         | ((-q2>>31) &   2)
>         | ((-q1>>31) &   1);
>
>OTOH if the conditional jumps are most often predicted correctly...
>
>Cheers,
>Gerd
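
For reference, the two variants quoted above can be put side by side in a small sketch. The function names are made up for illustration, and q[0]..q[7] stand in for q1..q8 of the quoted code. The mask variant assumes that >> on a negative int is an arithmetic shift (implementation-defined in C) and that all q values are non-negative:

```c
#include <stdint.h>

/* Hypothetical helper names; q[0]..q[7] stand in for q1..q8 of the post. */

/* Comparison-based packing: each (q != 0) is a 0/1 value that a compiler
   may produce with cmp/setnz, as discussed in the quoted text. */
static unsigned pack_compare(const int32_t q[8])
{
    unsigned bitfield = 0;
    for (int i = 7; i >= 0; --i)
        bitfield = (bitfield << 1) + (q[i] != 0);
    return bitfield;
}

/* Mask-based packing: (-q >> 31) is -1 for q > 0 and 0 for q == 0.
   Assumes an arithmetic right shift and q >= 0. */
static unsigned pack_mask(const int32_t q[8])
{
    unsigned bitfield = 0;
    for (int i = 0; i < 8; ++i)
        bitfield |= (unsigned)(-q[i] >> 31) & (1u << i);
    return bitfield;
}
```

Both functions set bit i of the result when q[i] is nonzero. Only the mask variant avoids going through the flags, and it gives wrong results for negative q.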

One additional note: if your q is signed char, the msc6 compiler is able to do a
nice optimization: only 8 negates and ands, a few ors, and just one final shift!

Gerd


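/* Sets bit i of the result if q[i] != 0.
   Assumes q[i] >= 0 and an arithmetic (sign-propagating) right shift. */
int pack2bits(signed char q[8])
{
	unsigned int bitfield;
	/* -q[i] is promoted to int; >>7 gives -1 for q[i] in 1..127, 0 for 0 */
	bitfield = ((-q[7]>>7) & 128)
		 | ((-q[6]>>7) &  64)
		 | ((-q[5]>>7) &  32)
		 | ((-q[4]>>7) &  16)
		 | ((-q[3]>>7) &   8)
		 | ((-q[2]>>7) &   4)
		 | ((-q[1]>>7) &   2)
		 | ((-q[0]>>7) &   1);
	return bitfield;
}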

PUBLIC	?pack2bits@@YAHQAC@Z				; pack2bits
; Function compile flags: /Ogty
;	COMDAT ?pack2bits@@YAHQAC@Z
_TEXT	SEGMENT
_q$ = 8
?pack2bits@@YAHQAC@Z PROC NEAR				; pack2bits, COMDAT
  00000	8b 4c 24 04	  mov	 ecx, DWORD PTR _q$[esp-4]
  00004	0f be 41 07	  movsx	 eax, BYTE PTR [ecx+7]
  00008	0f be 51 06	  movsx	 edx, BYTE PTR [ecx+6]
  0000c	f7 d8		  neg	 eax
  0000e	f7 da		  neg	 edx
  00010	25 00 40 00 00	  and	 eax, 16384		; 00004000H
  00015	81 e2 00 20 00 00 and	 edx, 8192		; 00002000H
  0001b	0b c2		  or	 eax, edx
  0001d	0f be 51 05	  movsx	 edx, BYTE PTR [ecx+5]
  00021	f7 da		  neg	 edx
  00023	81 e2 00 10 00 00 and	 edx, 4096		; 00001000H
  00029	0b c2		  or	 eax, edx
  0002b	0f be 51 04	  movsx	 edx, BYTE PTR [ecx+4]
  0002f	f7 da		  neg	 edx
  00031	81 e2 00 08 00 00 and	 edx, 2048		; 00000800H
  00037	0b c2		  or	 eax, edx
  00039	0f be 51 03	  movsx	 edx, BYTE PTR [ecx+3]
  0003d	f7 da		  neg	 edx
  0003f	81 e2 00 04 00 00 and	 edx, 1024		; 00000400H
  00045	0b c2		  or	 eax, edx
  00047	0f be 51 02	  movsx	 edx, BYTE PTR [ecx+2]
  0004b	f7 da		  neg	 edx
  0004d	81 e2 00 02 00 00 and	 edx, 512		; 00000200H
  00053	0b c2		  or	 eax, edx
  00055	0f be 51 01	  movsx	 edx, BYTE PTR [ecx+1]
  00059	8a 09		  mov	 cl, BYTE PTR [ecx]
  0005b	f7 da		  neg	 edx
  0005d	81 e2 00 01 00 00 and	 edx, 256		; 00000100H
  00063	f7 d9		  neg	 ecx
  00065	0b c2		  or	 eax, edx
  00067	81 e1 80 00 00 00 and	 ecx, 128		; 00000080H
  0006d	0b c1		  or	 eax, ecx
  0006f	c1 f8 07	  sar	 eax, 7
  00072	c3		  ret	 0
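
Read back into C, the listing seems to do roughly the following: instead of shifting each negated value by 7 and then masking bit i, it masks bit 7+i first, ORs everything together and shifts once at the end. This is only a sketch with made-up function names, under the same assumptions as before (q[i] >= 0, arithmetic right shift for the per-element version):

```c
/* Hypothetical names, for illustration only. */

/* Per-element form, as in the source function: shift each value, then mask. */
static unsigned pack_per_element(const signed char q[8])
{
    unsigned bitfield = 0;
    for (int i = 0; i < 8; ++i)
        bitfield |= (unsigned)(-q[i] >> 7) & (1u << i);
    return bitfield;
}

/* Form resembling the listing: mask bit 7+i of each negated value
   (128, 256, ..., 16384), OR the results, then shift right by 7 once. */
static unsigned pack_one_shift(const signed char q[8])
{
    unsigned acc = 0;
    for (int i = 0; i < 8; ++i)
        acc |= (unsigned)(-q[i]) & (128u << i);
    return acc >> 7;
}
```

For q[i] in 1..127 the negated value has bit 7 and all higher bits set, so picking bit 7+i before the shift yields the same result as picking bit i after it.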



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.