Author: Eugene Nalimov
Date: 16:26:50 01/05/99
Go up one level in this thread
On January 05, 1999 at 19:24:04, Eugene Nalimov wrote:
>On January 05, 1999 at 01:25:46, Dann Corbit wrote:
>
>>I would be curious to see timings of the assembly language variants versus this
>>simple C doo-dad:
>>#include <limits.h>
>>#include <stdlib.h>
>>#if CHAR_BIT == 8
>>static const char bits[256] =
>>{
>> 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
>> 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
>> 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
>> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>> 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
>> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>> 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
>> 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
>> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>> 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
>> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>> 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
>> 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
>> 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8
>>};
>>#else
>>PLEASE FIX ME.
>>#endif
>>
>>/*
>> ** Count bits in each byte
>> **
>> ** by Auke Reitsma
>> **
>> ** Torqued by D. Corbit
>> ** This version makes no assumptions about integer size.
>> ** If CHAR_BIT is not equal to 8, you will have to provide
>> ** a corrected table (see above).
>> */
>>
>>int bit_count_bytes(unsigned long x)
>>{
>> unsigned char * Ptr = (unsigned char *) &x;
>> int Accu;
>> switch (sizeof(x))
>> {
>> case 4:
>> Accu = bits[Ptr[0]] + bits[Ptr[1]] + bits[Ptr[2]] + bits[Ptr[3]];
>> break;
>> case 8:
>> Accu = bits[Ptr[0]] + bits[Ptr[1]] + bits[Ptr[2]] + bits[Ptr[3]] +
>> bits[Ptr[4]] + bits[Ptr[5]] + bits[Ptr[6]] + bits[Ptr[7]];
>> break;
>> default:
>> {
>> size_t i;
>> Accu = 0;
>> for (i = 0; i < sizeof(int); i++)
>> Accu += bits[Ptr[i]];
>> }
>> }
>> return Accu;
>>}
>
>Slightly modified routine, so it works for 8-bytes __int64, not for
>4-bytes integers. Test input is 70 __int64 integers with 0-2 bytes
>set. VC++ 6.0, PPro/200, NT4.0.
Sorry, of course I meant "0-2 bits set".
>Both routines inlined: assembly routine is 3.3 times faster.
>Both routines are non-inlined: assembly routine is 2.6 times faster.
>
>Eugene
Eugene
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.