Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: question about branchless code

Author: Uri Blass

Date: 08:03:20 02/27/03

Go up one level in this thread


On February 27, 2003 at 10:14:35, Matt Taylor wrote:

>On February 27, 2003 at 07:44:19, Uri Blass wrote:
>
>>On February 27, 2003 at 05:23:25, Tony Werten wrote:
>>
>>>On February 26, 2003 at 14:04:16, Uri Blass wrote:
>>>
>>>>I had the following code in my program
>>>>
>>>>do
>>>>{
>>>>...
>>>>if (info[target]==EMPTA)
>>>>  target+=8;
>>>>else
>>>>  target=64;
>>>>}
>>>>while (target<64)
>>>
>>>Less branches:
>>>
>>>do
>>>{
>>>...
>>>  tmp:=(info[target]!=EMPTY);  // maybe typecast to int ?
>>>  target+=(tmp*256)+8;
>>>} while (target<64);
>>>
>>>if (target>256) target=64;
>>>
>>>
>>>But I'm not sure if this works in C. (In pascal compares are garantied to return
>>>0 or 1 )
>>
>>It works in C(except the fact that I need to use a varaible for tmp)
>>
>>I even do not need the last if (target>256) target==64 because target is not
>>used after the loop but I am not sure if it makes me faster.
>>
>>I have many similiar loops and by comparing times by changing one loop
>>I find that it is even slightly slower.
>>
>>I used a global varaible for tmp because if I use a local caraible that I need
>>to define at the beginning of the function then it means that I need to change
>>all the similiar functions in order to practically use it.
>>
>>Uri
>
>Using a variable is not necessarily a bad thing. Many variables get optimized
>away. Even if you use the computation as an intermediate, there is no guarantee
>that the compiler won't store it to memory.
>
>One common technique for optimizing variable usage is to convert code into a
>so-called IL ("Intermediate Language"). The IL supports an basically infinite
>number of registers. Each calculation you perform goes into one of these
>registers, and it's never modified afterward. The compiler then performs
>liveness analysis on the IL to determine when it can throw away computations.
>Other things can affect optimization (such as whether or not you take the
>address of the variable), but in this case and many others, you are no worse off
>to use a variable over an intermediate computation.
>
>If your loops are all similar, it means you should probably put one copy in a
>function.
 Global variables are discouraged because they cause thread re-entry
>issues. (i.e. you will have many problems if you decide to take advantage of
>parallel processing.)
>
>Again, the speed difference you're talking about is very minimal. Concentrate on
>a good design first and worry about speed later. You might address speed in your
>design, but optimization is an iterative, topological process. First you
>optimize your algorithms (i.e. bitboards are faster in some cases, arrays are
>faster in others, etc.) Next you can optimize memory access. If you want to
>optimize to the clock cycle, you can then work on individual loops like this.
>However, you want to use a profiler to do it -- there is no sense in optimizing
>an infrequently executed loop when there are a dozen others that could use
>optimizing. Speed often goes hand-in-hand with a good design.
>
>-Matt

I know that I can get bigger speed improvements from changing the design and
improving the algorithm but this process is longer than changing few lines
and I like to see first what I can gain by changing few lines.

I do not think that there is a contradiction between better design and
these speed improvements.

I also found some small speed improvement that I can describe as improvement of
the algorithm

My initial loop was the following loop(I include also the line before the loop.

target+=8;
do
{
   directsee[target]&=~131074;
   if (info[target]==EMPTA)
     target+=8;
   else
     target=64;
}
while (target<64);

better is

do
{
  target+=8;
  directsee[target]&=~131074;
  if (info[target]!=EMPTA)
    break;
}
while (target<56)

I can say that it is improvement of the algorithm because I save target+=8
that is not relevant.

Note that the arrays that I used in this post are not going to be changed in the
better design.

The name EMPTA and not EMPTY was used because I have different constants for
pieces and for information of the square.

I have PAWN=0,KNIGHT=1,EMPTY=6

I also have

info[squarre]=
whitepawn=0,whiteknight=1,...blackpawn=8,...
EMPTA=22

The idea is that I want to find by info[square] the piece and the color by
simple defines so I can do it by defines like defining piece(square)
to be info[square]&7


Uri



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.