Author: Gerd Isenberg
Date: 07:39:53 08/23/03
Go up one level in this thread
On August 23, 2003 at 07:07:25, Uri Blass wrote:
<snip>
>>>for (k=0;k<weight[i];k++)
>>>f(power[i][k]);
>>>
>>>Uri
>>
>>I see, but instead of saving "i" (may be only a register/register mov), you have
>>an additional [256][8] table. Anyway, if this is a hotspot and function "f" is
>>cheap you may consider loop unrolling, even with inlined "f".
>
>I do not know
>practically I do not have f as a function but as a list of 3 commands so my
>profiler does not tell me exactly how much time I use for it.
>
>I only know that the lines are inside functions that take significant time
>(near 20% of all the time that the program runs).
>
Ok, that's worth to optimize.
>
> If the order of
>>"k" doesn't matter, in that way:
>>
>>switch (weight[i])
>>{
>>case 8: f(power[i][7]);
>>case 7: f(power[i][6]);
>>case 6: f(power[i][5]);
>>case 5: f(power[i][4]);
>>case 4: f(power[i][3]);
>>case 3: f(power[i][2]);
>>case 2: f(power[i][1]);
>>case 1: f(power[i][0]);
>>case 0: break;
>>}
>>
>>with MSC you may add a default case with assume(0).
>>
>>Gerd
>
>Thanks
>I may try it.
>
>I may save case 0 and case 1 and more cases when I know that they can never
>happen
>
>the number of directions that king can go in an empty board is 8,5 or 3 and
>there are no more possibilities.
>
>The number of directions that knight can go in an empty board(calculated
>manually) is only 2,3,4,6,8 if I have no mistake.
>
>Uri
That only matters if the switch is implemeted with a "if else if else" chain,
but optimizing compiler like MSVC will use a jmp-table here.
I wonder whether your algorithm's runtime is dependend on the numbers of
directions at all. If you have all 8 directions, is there no way to combine the
8 steps partially in some way, e.g. by combining some ors?
f(power[i][7]);
f(power[i][6]);
f(power[i][5]);
f(power[i][4]);
f(power[i][3]);
f(power[i][2]);
f(power[i][1]);
f(power[i][0]);
==> f8() // i is 255
f(power[i][6]);
f(power[i][5]);
f(power[i][4]);
f(power[i][3]);
f(power[i][2]);
f(power[i][1]);
f(power[i][0]);
==> f7(i)
....
I would really try a switch256 with a small source oode generator.
It simply avoids the two lookups and the compiler may have a lot to optimize.
Gerd
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.