Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: question about optimizing loops

Author: Gerd Isenberg

Date: 07:39:53 08/23/03

Go up one level in this thread


On August 23, 2003 at 07:07:25, Uri Blass wrote:
<snip>

>>>for (k=0;k<weight[i];k++)
>>>f(power[i][k]);
>>>
>>>Uri
>>
>>I see, but instead of saving "i" (may be only a register/register mov), you have
>>an additional [256][8] table. Anyway, if this is a hotspot and function "f" is
>>cheap you may consider loop unrolling, even with inlined "f".
>
>I do not know
>practically I do not have f as a function but as a list of 3 commands so my
>profiler does not tell me exactly how much time I use for it.
>
>I only know that the lines are inside functions that take significant time
>(near 20% of all the time that the program runs).
>

Ok, that's worth to optimize.


>
> If the order of
>>"k" doesn't matter, in that way:
>>
>>switch (weight[i])
>>{
>>case 8: f(power[i][7]);
>>case 7: f(power[i][6]);
>>case 6: f(power[i][5]);
>>case 5: f(power[i][4]);
>>case 4: f(power[i][3]);
>>case 3: f(power[i][2]);
>>case 2: f(power[i][1]);
>>case 1: f(power[i][0]);
>>case 0: break;
>>}
>>
>>with MSC you may add a default case with assume(0).
>>
>>Gerd
>
>Thanks
>I may try it.
>
>I may  save case 0 and case 1 and more cases when I know that they can never
>happen
>
>the number of directions that king can go in an empty board is 8,5 or 3 and
>there are no more possibilities.
>
>The number of directions that knight can go in an empty board(calculated
>manually) is only 2,3,4,6,8 if I have no mistake.
>
>Uri

That only matters if the switch is implemeted with a "if else if else" chain,
but optimizing compiler like MSVC will use a jmp-table here.

I wonder whether your algorithm's runtime is dependend on the numbers of
directions at all. If you have all 8 directions, is there no way to combine the
8 steps partially in some way, e.g. by combining some ors?

f(power[i][7]);
f(power[i][6]);
f(power[i][5]);
f(power[i][4]);
f(power[i][3]);
f(power[i][2]);
f(power[i][1]);
f(power[i][0]);
==> f8() // i is 255

f(power[i][6]);
f(power[i][5]);
f(power[i][4]);
f(power[i][3]);
f(power[i][2]);
f(power[i][1]);
f(power[i][0]);
==> f7(i)
....

I would really try a switch256 with a small source oode generator.
It simply avoids the two lookups and the compiler may have a lot to optimize.

Gerd



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.