Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: a question about speed

Author: Uri Blass
Date: 01:31:55 01/07/03
On January 06, 2003 at 22:11:11, Matt Taylor wrote:

>On January 06, 2003 at 19:49:10, Uri Blass wrote:
>
>>On January 06, 2003 at 18:55:22, Heiner Marxen wrote:
>>
>>>On January 03, 2003 at 20:26:44, Matt Taylor wrote:
>>>
>>>>On January 03, 2003 at 14:27:02, Ricardo Gibert wrote:
>>>>
>>>>>On January 03, 2003 at 14:09:56, Uri Blass wrote:
>>>>>
>>>>>>On January 03, 2003 at 14:04:23, Ricardo Gibert wrote:
>>>>>>
>>>>>>>On January 03, 2003 at 12:03:45, Uri Blass wrote:
>>>>>>>
>>>>>>>>code B is slightly faster than code A.
>>>>>>>>I know that side can get only 0 or 1(something that the compiler does not know)
>>>>>>>>and B is eqvivalent to A if you assume that side gets only 0 or 1.
>>>>>>>>
>>>>>>>>Is it possible to write a third code that will be even faster than B?
>>>>>>>>
>>>>>>>>I think that if the compiler can know that side is or 0 or 1 it can do B even
>>>>>>>>faster.
>>>>>>>>
>>>>>>>>code A:
>>>>>>>>
>>>>>>>>if (side==LIGHT)
>>>>>>>>{
>>>>>>>>  if (to>=56)
>>>>>>>>  {
>>>>>>>>    gen_promote(from,to,bits);
>>>>>>>>    return;
>>>>>>>>  }
>>>>>>>>}
>>>>>>>>else
>>>>>>>>{
>>>>>>>>  if (to<=7)
>>>>>>>>  {
>>>>>>>>    gen_promote(from,to,bits);
>>>>>>>>    return;
>>>>>>>>  }
>>>>>>>>}
>>>>>>>
>>>>>>>
>>>>>>>Try something like this:
>>>>>>>
>>>>>>>if ((to<=7) || (to>=56))
>>>>>
>>>>>
>>>>>You could also use "if (to-8 >= 48)" instead of the above. This assumes "to" is
>>>>>unsigned. It would be a shame for you to duplicate code just avoid 1 extra
>>>>>operation (the subtraction).
>>>><snip>
>>>>
>>>>That is awesome. I did not even think of that. That is probably the best code
>>>>possible to put there.
>>>>
>>>>-Matt
>>>
>>>That is a trick to do a range check, which I first saw in some code generated
>>>from a C compiler for a switch statement (loong ago).
>>>
>>>From the sources of Chest:
>>>
>>>/*
>>> * in_range() is general purpose range checking macro for integral types.
>>> * Note, that the lower bound is inclusive and the upper bound exclusive !
>>> * Note, that only the lower bound is expanded more than once.
>>> */
>>>#define in_range(x, lo, hi)                                     \
>>>        (  (((unsigned long)(x )) - ((unsigned long)(lo)))      \
>>>         < (((unsigned long)(hi)) - ((unsigned long)(lo)))      \
>>>        )
>>>
>>>Cheers,
>>>Heiner
>>
>>The question is if the compiler does not know to do that trick if
>>I tell it
>>if ((x>=lo)&&(x<hi))
>>
>>The point is that I am not sure if it is a good idea to spend energy about
>>tricks when tricks may in some cases even prevent the compiler to find even
>>better tricks.
>>The only trick that may be productive is trick that is based on knowledge that
>>the compiler cannot know but I think that even in this case a smart compiler
>>should ask the user productive question.
>>
>>Here is an example
>>
>>I had in my program the following code for finding the direction that the king
>>is attacked.
>>
>>temp=directsee[kingsquare[side]];
>>if (side==0)
>>		temp=(temp>>16)&65535;
>>	else
>>		temp=temp&65535;
>>
>>
>>I know that side cannot get values except 0 or 1.
>>
>>A smart compiler can run the program find that side always get only 0 or 1 and
>>ask me if side can get value that is not 0 or 1.
>>
>>Unfortunately I guess that compilers are not smart enough so they do not do it.
>>
>>The result is that I use the following trick
>>
>>#define dsee(target,side) ((directsee[(target)]>>((side)<<4))&65535)
>>temp=dsee([kingsquare[side]],1^side)
>>
>>People told me that I can use BOOL instead of int in C++ but I use only C files
>>and I do not like to change my files to .cpp because I read that C++ is not
>>exactly more than C and there are cases when the same code in C++ can give
>>different results than the same code in C.
>>
>>Uri
>
>Changing the extension to .cpp should have no impact on the performance of the
>code. Theoretically the C++ compiler is just as good at optimizing C code as the
>C compiler is. Because of the way compilers are built, this should hold true.
>
>No compiler I know of can afford to assume that side is either 0 or 1. Profiled
>or not, it means you would have to trigger -every- possible value to ensure
>correct code generation. It is theoretically possible to determine possible
>values for side, but I don't know of any compiler that currently does this.
>Ironic that this point comes up because I was discussing today the possibility
>of building a tool to do just that. :-)
>
>FYI, with full optimization VC generated two comparisons for your test.
>
>Also FYI, I can see much potential if you allow side to be 0 or -1. Assuming 0
>or -1 (instead of 1), your other code becomes:
>
>temp = directsee[kingsquare[side]];
>temp = (((temp >> 16) & ~side) | (temp & side)) & 65535;
>
>Not as readable, but much faster.
>
>Also, with regard to pure C optimization, I discovered a day or two ago that the
>AMD manual addresses optimization of -pure- C code. The following links to a
>post I made about it:
>http://www.talkchess.com/forums/1/message.html?275188
>
>-Matt

Thanks

I added this site to my favourites and I may look at it later.

The question is if I cannot use define that will give the same result
when I put directsee[target] instead of your temp in the defintion.

If I can do it then I do not see why the code is less readable.
The only possible problem is if the compiler does not do the same optimization
for your 2 commands and for the following single C command:

temp = (((directsee[kingsquare[side]] >> 16) & ~side) |
(directsee[kingsquare[side]] & side)) & 65535;

Uri
Re: a question about speed Matt Taylor 09:26:03 01/07/03
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.