Computer Chess Club Archives



Subject: Re: Question for the Crafty/Compiler experts (more)

Author: Robert Hyatt

Date: 09:32:40 02/20/04


On February 20, 2004 at 07:20:11, Dieter Buerssner wrote:

>On February 19, 2004 at 22:39:47, Robert Hyatt wrote:
>
>>On February 19, 2004 at 16:44:38, Dieter Buerssner wrote:
>>
>>>On February 19, 2004 at 16:16:23, Robert Hyatt wrote:
>>>
>>>>I looked at my old inline asm docs, and I found the following statement,
>>>>paraphrased:
>>>>
>>>>"you don't need to put gcc-allocated registers on the clobber list, it knows you
>>>>are fooling with them."
>>>
>>>That's news to me. I cannot find it in my gcc manual (and I know, that it did
>>>not work with earlier gcc versions).
>>>
>>>Some snippets of my gcc manual:
>>>
>>>---
>>>The compiler cannot check whether the operands have data types that are
>>>reasonable for the instruction being executed.  It does not parse the assembler
>>>instruction template and does not know what it means or even whether it is valid
>>>assembler input.
>>>
>>>[...]
>>>   Some instructions clobber specific hard registers.  To describe this,
>>                                       ^^^^^^^^^^^^^^^
>>
>>I am keying on that in your quote.  "hard register" as opposed to "dynamically
>>assigned register"...  I won't say that is a correct interpretation, of course.
>>The syntax is not horrible for inline asm, but the documentation leaves a lot to
>>be desired, clarity-wise...
>
>To me, it looks rather clear that you must notify the compiler if you modify
>an input operand. Later in the gcc manual, it is also explicitly mentioned that
>input operands are assumed to be read-only. It is also mentioned that gcc does
>not try at all to understand the inline assembler; it just substitutes the real
>operands for the %0/%1/%2...
>
>A little test case. Your popcount slightly modified, so that the inputs will be
>used later (it might happen similarly in real code when inlining).
>
>
>int bob(unsigned long a, unsigned long b)
>{
>/*  %0=result, %1=tmp, %2=first input, %3=second input */
>  long      dummy, dummy2;
>
>asm("        xorl    %0, %0"                    "\n\t"
>    "        testl   %2, %2"                    "\n\t"
>    "        jz      2f"                        "\n\t"
>    "1:      leal    -1(%2), %1"                "\n\t"
>    "        incl    %0"                        "\n\t"
>    "        andl    %1, %2"                    "\n\t"
>    "        jnz     1b"                        "\n\t"
>    "2:      testl   %3, %3"                    "\n\t"
>    "        jz      4f"                        "\n\t"
>    "3:      leal    -1(%3), %1"                "\n\t"
>    "        incl    %0"                        "\n\t"
>    "        andl    %1, %3"                    "\n\t"
>    "        jnz     3b"                        "\n\t"
>    "4:"                                        "\n\t"
>  : "=&q" (dummy), "=&q" (dummy2)
>  : "q" (a), "q" (b)
>  : "cc");
>  return dummy+a+b;
>}
>
>/* Too lazy, to format nicely, also really untested */
>int dieter(unsigned long a, unsigned long b)
>{
>  int tmp, tmp2, n;
>  __asm__ __volatile__(
>   "movl %3, %1  \n\t"
>   "xorl %0, %0  \n\t"
>   "testl %1, %1  \n\t"
>   "je 1f  \n\t"
>   "0: incl %0  \n\t"
>   "leal -1(%1), %2  \n\t"
>   "andl %2, %1  \n\t"
>   "jne 0b  \n\t"
>   "1: movl %4, %1  \n\t"
>   "testl %1, %1  \n\t"
>   "je 3f  \n\t"
>   "2: incl %0  \n\t"
>   "leal -1(%1), %2  \n\t"
>   "andl %2, %1  \n\t"
>   "jne 2b  \n\t"
>   "3:"
>   : "=&r" (n), "=&r" (tmp), "=&r" (tmp2)
>   : "g" (a), "g" (b)
>   : "cc" /* Flags "condition code" changed */);
>  return n+a+b;
>}
>
>gcc -O -S produces (I added some ; comments):
>
>	.file	"bob.c"
>	.section .text
>	.p2align 1
>.globl _bob
>_bob:
>	pushl	%ebp
>	movl	%esp, %ebp
>	pushl	%ebx
>	movl	8(%ebp), %edx
>; a to edx
>	movl	12(%ebp), %ecx
>; b to ecx
>/APP
>	        xorl    %eax, %eax
>	        testl   %edx, %edx
>	        jz      2f
>	1:      leal    -1(%edx), %ebx
>	        incl    %eax
>	        andl    %ebx, %edx
>; edx modified
>	        jnz     1b
>	2:      testl   %ecx, %ecx
>	        jz      4f
>	3:      leal    -1(%ecx), %ebx
>	        incl    %eax
>	        andl    %ebx, %ecx
>; ecx modified
>	        jnz     3b
>	4:
>
>/NO_APP
>	addl	%edx, %eax
>; oops using edx again, thinking it is not modified
>	addl	%ecx, %eax
>; ditto ecx
>	popl	%ebx
>	popl	%ebp
>	ret
>	.p2align 1
>.globl _dieter
>_dieter:
>	pushl	%ebp
>	movl	%esp, %ebp
>	pushl	%esi
>	pushl	%ebx
>	movl	8(%ebp), %edx
>	movl	12(%ebp), %ecx
>/APP
>	movl %edx, %esi
>	xorl %eax, %eax
>	testl %esi, %esi
>	je 1f
>	0: incl %eax
>	leal -1(%esi), %ebx
>	andl %ebx, %esi
>	jne 0b
>	1: movl %ecx, %esi
>	testl %esi, %esi
>	je 3f
>	2: incl %eax
>	leal -1(%esi), %ebx
>	andl %ebx, %esi
>	jne 2b
>	3:
>/NO_APP
>; using unmodified a and b here
>	addl	%edx, %eax
>	addl	%ecx, %eax
>	popl	%ebx
>	popl	%esi
>	popl	%ebp
>	ret
>	.ident	"GCC: (GNU) 3.2"
>
>Regards,
>Dieter

OK... certainly looks like you are right.  I probably escaped because there are
so few registers, and there was no way to carry stuff through the function call.
In fact, the upper and lower halves of the particular bitboard are probably not
used again until long after the registers have been used for something else, so
things worked out ok.  I have changed the inline code to make sure it doesn't
break on simple tests, however...
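
For illustration only, here is a minimal sketch of one way such a fix could look.
It is not the actual change made to Crafty. The names popcnt2 and bob_fixed are
hypothetical. Instead of copying the inputs inside the asm as Dieter's version
does, it makes local copies in C and declares them as read-write ("+r")
operands, so gcc knows they are modified and keeps a and b intact. It also
uses movl/subl in place of leal, and falls back to plain C when not compiling
with gcc on x86.

```c
/* Hypothetical helper (not Crafty's code): count the set bits in two
   32-bit words. */
static int popcnt2(unsigned int a, unsigned int b)
{
  int n = 0;
  unsigned int x = a, y = b;    /* scratch copies; a and b stay untouched */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  unsigned int tmp;
  /* %0=result, %1=copy of a (read-write), %2=copy of b (read-write), %3=tmp */
  __asm__(
    "        xorl    %0, %0"      "\n\t"
    "        testl   %1, %1"      "\n\t"
    "        jz      2f"          "\n\t"
    "1:      movl    %1, %3"      "\n\t"
    "        subl    $1, %3"      "\n\t"
    "        incl    %0"          "\n\t"
    "        andl    %3, %1"      "\n\t"   /* clear lowest set bit */
    "        jnz     1b"          "\n\t"
    "2:      testl   %2, %2"      "\n\t"
    "        jz      4f"          "\n\t"
    "3:      movl    %2, %3"      "\n\t"
    "        subl    $1, %3"      "\n\t"
    "        incl    %0"          "\n\t"
    "        andl    %3, %2"      "\n\t"   /* clear lowest set bit */
    "        jnz     3b"          "\n\t"
    "4:"
    : "=&r" (n), "+r" (x), "+r" (y), "=&r" (tmp)
    :
    : "cc");
#else
  /* Portable fallback: the same clear-lowest-bit loop in C. */
  while (x) { x &= x - 1; n++; }
  while (y) { y &= y - 1; n++; }
#endif
  return n;
}

/* Same shape as the test case above: the inputs are used after the asm. */
static int bob_fixed(unsigned int a, unsigned int b)
{
  return (int)(popcnt2(a, b) + a + b);
}
```

With this version, a call such as bob_fixed(3, 4) adds the original values of
a and b to the bit count, because the asm only ever modifies x and y.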




Last modified: Thu, 15 Apr 21 08:11:13 -0700