Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: question about optimizing code

Author: Andrew Dados

Date: 12:09:16 07/07/03

Go up one level in this thread


On July 07, 2003 at 14:45:19, Gerd Isenberg wrote:

>On July 07, 2003 at 12:26:17, Andrew Dados wrote:
>
>>On July 06, 2003 at 23:25:00, Uri Blass wrote:
>>
>>>In movei I have a number between 0 and 255 and I want to do some functions for
>>>every 1 that it has in its binary representation(number of functions to
>>>implement is the same as the number of 1's in the number and there is a
>>>different function based on the value of the 1)
>>>
>>>I do not like to call functions because functions use local varaibles and
>>>generating a copy of the local varaibles only to forget them may take computer
>>>time and I also do not use the functions more than once.
>>>
>>>I thought to use switch(see the code later in this post) but the problem is that
>>>switch continue to do everything without break and I want to have switch that
>>>does not continue to do everything but does not quit and simply forget about the
>>>fact that I called switch.
>>>
>>>Here is the type of the code that I plan to have now.
>>>I do not like all the if (number&2) or if (number&4) and the question is if I
>>>can do it faster.
>>>
>>>
>>>i=smallestpower[number];
>>>switch(i)
>>>{
>>>   case 0:do list of commands
>>>          if (number==1) break
>>>          else
>>>          {
>>>            number-=1;
>>>            i=smallestpower[number];
>>>          }
>>>  case 1:if (number&2)
>>>         {
>>>           do list of commands
>>>           if (number==2) break;
>>>           else
>>>           {
>>>             number-=2;
>>>             i=smallestpower[number];
>>>           }
>>>         }
>>>  case 2:if (number&4)
>>>         {
>>>           do list of commands
>>>           if (number==4) break;
>>>           else
>>>           {
>>>             number-=4;
>>>             i=smallestpower[number];
>>>           }
>>>         }
>>>  case 3:if (number&8)
>>>         ...
>>>  ...
>>>  case 6:if (number&64)
>>>         ...
>>>  case 7://if you got there you know that number is 128 because you did not get
>>>out by break so you do not need to check it
>>>
>>>
>>>Uri
>>
>>boolean checking is way faster then jumps; in your implementation you get about
>>6 jumps in the code (case is a jump) since you have about 1.2 bit set on
>>average. I'll try to go down to below 4 jumps average:
>>
>>if i & (128+64){
>> if  i & 128 {}
>> if  i & 64  {}
>>if ! i | 63 goto done;
>>}
>>
>>if i & (32+16){
>> if  i & 32 {}
>> if  i & 16  {}
>>if ! i | 15 goto done;
>>}
>>
>>if i & (8+4){
>> if  i & 8 {}
>> if  i & 4  {}
>>}
>>
>>if i & (2+1){
>> if  i & 2 {}
>> if  i & 1  {}
>>}
>>
>>done:
>>
>>--------------
>>Since you have small number of average bits set this should be way faster then
>>case switch
>
> - no, see below.
>
>>
>>-Andrew-
>
>Hi Andrew,
>
>interesting asymmetric approach based on Uri's probability of the higher bits.
>
>Lets see, One bit set:
>bit 7,6 - 4 compares
>bit 5,4 - 1 + 4 = 5 compares
>bit 3,2 - 1 + 1 + 3 + 1 = 6 compares
>bit 1,0 - 1 + 1 + 1 + 3 = 6 compares
>
>Fine - and all seldom cases of multiple bits are handled too.
>That's the real advantage of your code - it don't acts like a switch or a "if
>else if" cascade with one mutual exclusive case. It behaves exactly like a
>sequence of eight if stamements, considering perfectly Uri's bit derivation.
>
>Thought about an outer if i & (128+64+32+8) for a further binary divide but it
>is worse here.
>
>The switch-statement per se should be superiour due to the consecutive cases
>from 0 to 8 - with the help of Uri's firstOne byte lookup. So only one indirect
>jump (eg. in assembler jmp [_jmptable + 4*eax]) - but all cases are exclusive
>and one must handle multiple set bits in an outer while-loop or Reinhard's
>individual goto_work do-while.
>
>So in best case Tim's while-switch approach requires 3 conditional or indirect
>jumps with one bit set (two while and one indirect switch jump, not considered
>range checking) but additional memory access and clearing the traversed bit.
>Reinhard's goto_work do-while approach needs one compare less and profits from
>set bit 7.
>
>So in average i guess Reinhard's  approach is fastest.
>
>But the fastest of course is the switch(256) approach, if code size doesn't
>matter and the jumptable is in cache ;-)
>
>Regards,
>Gerd

2 remarks:
Jump to memory reference takes 4 cycles,no branch prediction can be applied,
short jump takes 1 cycle; while you are counting compares I was counting jumps
(1-2 less then compare in each of your cases).

I guess simply benchmarking that code against Reinhards should solve the issue.

-Andrew-



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.