Author: Dieter Buerssner
Date: 01:23:19 09/16/01
On September 15, 2001 at 22:22:18, Vincent Diepeveen wrote:
>On September 15, 2001 at 21:42:18, Dieter Buerssner wrote:
>
>>I don't agree. There are too many things that, in principle, just cannot be
>>optimized as one would like, especially when pointers come into play. If you
>>take the address of an object at some point, many optimization possibilities
>>are just gone because of aliasing issues. The same is true for the references
>>pointed out in the other post.
>
>Oh sorry, didn't realize JAVA was that a horrible language, thanks
>for the explanation!
Actually, I had examples in C and C++ in mind, compared to, say, Fortran.
For example some inner loop of a matrix multiplication:
  c[i][j] = 0;
  for (k=0; k<N; k++)
    c[i][j] += a[i][k] * b[k][j];
i and j are constant in the inner loop. Still, a C compiler cannot easily keep
c[i][j] in a floating point register, because a store to c[i][j] could change an
element of a or b: the matrices may overlap. (Calculating the actual address of
c[i][j] outside of the loop is no problem for the C optimizer.) Fortran could do
a much better job of optimizing this. In C, a temporary variable will fix it
(and perhaps also the restrict keyword of the new C99 standard). I am not saying
or thinking that Fortran is better. I just wanted to point out a difference.
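A minimal sketch of the two C fixes mentioned above, not tuned code. The
function names, the full triple loop and the fixed dimension N = 3 are
assumptions made for illustration. The first version accumulates in a local
variable, which nothing else can alias. The second uses C99 restrict, by which
the caller promises that c does not overlap a or b.

```c
#include <stddef.h>

#define N 3  /* assumed matrix dimension, for illustration only */

/* Temporary-variable fix: sum is a local whose address is never taken,
   so the compiler is free to keep it in a register across the k loop. */
void matmul_tmp(double c[N][N], double a[N][N], double b[N][N])
{
    size_t i, j, k;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            double sum = 0.0;
            for (k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
    }
}

/* C99 restrict: the caller promises that the three matrices do not
   overlap, so c[i][j] may be kept in a register as well. Calling this
   with overlapping c and a (or b) is undefined behavior. */
void matmul_restrict(double (*restrict c)[N],
                     double (*restrict a)[N],
                     double (*restrict b)[N])
{
    size_t i, j, k;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            c[i][j] = 0.0;
            for (k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
    }
}
```

Whether a particular compiler actually produces better code for either version
has to be checked in the generated assembly.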
Or, even more primitive, with the value passed by reference (here through the
pointer valp):
  void foo(int a[], int *valp)
  {
    int i;
    for (i=0; i<N; i++)
      a[i] = *valp;
  }
Putting *valp into a register is not possible, because valp may point into a[],
and the compiler must assume that a store to a[i] can change *valp. An optimizer
doing this would be broken. In C, this example may look very artificial; in C++,
it may look natural in a bigger context when using a reference parameter. If one
is aware of this, I don't see a major problem. However, I cannot understand the
statement in another post that passing parameters by reference is usually faster
(when the parameters passed are not large objects).
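A sketch of the manual workaround, under some assumptions: the loop body uses
+= instead of the plain assignment from the example (a deliberate change, so
that aliasing becomes visible in the result), and the function names and the
explicit length n are hypothetical. Copying *valp into a local once gives the
compiler a value that cannot be aliased, but the two functions only agree when
valp does not point into a[].

```c
#include <stddef.h>

/* Reads *valp again on every iteration. This stays correct even when
   valp points at an element of a[]. */
void add_reload(int a[], size_t n, const int *valp)
{
    size_t i;
    for (i = 0; i < n; i++)
        a[i] += *valp;
}

/* Reads *valp once into a local copy. The copy cannot be changed by the
   stores to a[i], so it can live in a register. The result matches
   add_reload only if valp does not point into a[]. */
void add_cached(int a[], size_t n, const int *valp)
{
    const int val = *valp;
    size_t i;
    for (i = 0; i < n; i++)
        a[i] += val;
}
```

With a = {1, 2, 3} and valp = &a[0], add_reload sees the updated a[0] from the
second iteration on, while add_cached keeps adding the original value.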
>>Actually, I found that gcc optimizes array code too well too often for x86 (not
>>for alpha or MIPS, when I tested last). Often this will need additional
>>variables for the pointers magically used, where the indexed access is often de
>>facto free on x86, as Bruce has explained. Most of my numerical code runs faster
>>when optimized with -O with gcc, instead of optimizing it with -O2, where, at
>>the time I checked, all array code was converted to pointer code.
>
>diep NEVER has run faster with -O with gcc, always -O2 was tens of
>percents faster than -O.
>
>You are using inline assembly and bitboards, i remember Bob complaining
>about -O being faster too?
I am not using inline assembly for chess and almost never for numerical
programming. One exception is multi-precision arithmetic, where a huge factor
can be won by using (inline) assembly. I only use a few bitboards. In
particular, I need no Popcount, Firstone, etc., so I see no need for inline
assembly.
>> if (!(a|b))
>> do this
>
>yep
>
>>will produce something like
>>
>> or eax, edx
>> jnz some_label
>
>>and does not have the disadvantages of your "+" method.
>
>i know
Allow me one question: If you know, why did you suggest the bogus method with
+? Just to confuse readers of your post? :-)
BTW, neither method is guaranteed to work in ISO C for pointers. For example, it
is not allowed to add pointers. And I think this is disallowed for good reasons.
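A small sketch of the difference for unsigned integers, assuming the "+" method
means testing !(a+b), which is not shown in this post. All function names are
hypothetical. Unsigned addition wraps around, so the sum can be zero although
both operands are nonzero, whereas a|b is zero only if both are zero. For
pointers, comparing each one against NULL is the portable spelling.

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 exactly when both a and b are zero: a|b has a bit set as
   soon as either operand has one. */
int both_zero_or(unsigned a, unsigned b)
{
    return !(a | b);
}

/* The "+" variant: unsigned addition wraps modulo UINT_MAX + 1, so this
   also returns 1 for pairs such as 1 and UINT_MAX. */
int both_zero_add(unsigned a, unsigned b)
{
    return !(a + b);
}

/* Portable test for two pointers: no arithmetic or bit operations on
   the pointer values themselves. */
int both_null(const void *p, const void *q)
{
    return p == NULL && q == NULL;
}
```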
>>if ((a.word1&b.word1)||(a.word2&b.word2))
>this is exactly what i mean, a compiler being that smart that it
>can save a branch i must see first!
Well, in the case where the first half mostly evaluates to != 0, the logical or
is faster. How should a normal compiler know this? Perhaps some compilers that
can use profiling statistics are smart enough to figure it out.
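The two spellings side by side, as a sketch. The struct layout (two 32-bit
words) and the function names are assumptions; only the field names word1 and
word2 come from the quoted snippet. Both functions return the same result. They
differ only in whether the second half may be skipped.

```c
#include <stdint.h>

/* Hypothetical two-word bitboard; field names follow the quoted snippet. */
typedef struct {
    uint32_t word1;
    uint32_t word2;
} bitboard;

/* Logical or: the second half is not evaluated when the first half is
   already nonzero. */
int intersects_logical(bitboard a, bitboard b)
{
    return (a.word1 & b.word1) || (a.word2 & b.word2);
}

/* Bitwise or: both halves are always evaluated and combined, followed
   by a single test against zero. */
int intersects_bitwise(bitboard a, bitboard b)
{
    return ((a.word1 & b.word1) | (a.word2 & b.word2)) != 0;
}
```

Which form is faster depends on the data and on the code the compiler emits for
each, so neither is guaranteed to save a branch.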
Regards,
Dieter