Author: Eugene Nalimov
Date: 16:26:50 01/05/99
On January 05, 1999 at 19:24:04, Eugene Nalimov wrote:

>On January 05, 1999 at 01:25:46, Dann Corbit wrote:
>
>>I would be curious to see timings of the assembly language variants versus this
>>simple C doo-dad:
>>#include <limits.h>
>>#include <stdlib.h>
>>#if CHAR_BIT == 8
>>static const char bits[256] =
>>{
>>    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
>>    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
>>    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
>>    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>>    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
>>    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>>    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>>    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
>>    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
>>    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>>    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>>    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
>>    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
>>    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
>>    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
>>    4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8
>>};
>>#else
>>PLEASE FIX ME.
>>#endif
>>
>>/*
>> **  Count bits in each byte
>> **
>> **  by Auke Reitsma
>> **
>> **  Torqued by D. Corbit
>> **  This version makes no assumptions about integer size.
>> **  If CHAR_BIT is not equal to 8, you will have to provide
>> **  a corrected table (see above).
>> */
>>
>>int bit_count_bytes(unsigned long x)
>>{
>>    unsigned char * Ptr = (unsigned char *) &x;
>>    int Accu;
>>    switch (sizeof(x))
>>    {
>>    case 4:
>>        Accu = bits[Ptr[0]] + bits[Ptr[1]] + bits[Ptr[2]] + bits[Ptr[3]];
>>        break;
>>    case 8:
>>        Accu = bits[Ptr[0]] + bits[Ptr[1]] + bits[Ptr[2]] + bits[Ptr[3]] +
>>               bits[Ptr[4]] + bits[Ptr[5]] + bits[Ptr[6]] + bits[Ptr[7]];
>>        break;
>>    default:
>>        {
>>            size_t i;
>>            Accu = 0;
>>            for (i = 0; i < sizeof(x); i++)
>>                Accu += bits[Ptr[i]];
>>        }
>>    }
>>    return Accu;
>>}
>
>Slightly modified routine, so it works for 8-bytes __int64, not for
>4-bytes integers. Test input is 70 __int64 integers with 0-2 bytes
>set. VC++ 6.0, PPro/200, NT4.0.

Sorry, of course I meant "0-2 bits set".

>Both routines inlined: assembly routine is 3.3 times faster.
>Both routines are non-inlined: assembly routine is 2.6 times faster.
>
>Eugene

Eugene
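For anyone who wants to repeat the comparison, here is a rough sketch of what a 64-bit table-lookup variant might look like in portable C. It is not the modified routine that was timed above, which was not posted. The names bit_count_bytes64, bit_count_sparse and bit_count_selfcheck are made up for illustration, uint64_t stands in for VC++'s __int64, and the table is generated with macros instead of being typed out. The second routine clears the lowest set bit on each pass, so it loops once per set bit; it is included only as an independent cross-check, and no timing claim is made for either one.

```c
#include <assert.h>
#include <stdint.h>

/* Macro-generated table: bits[b] is the number of set bits in byte b. */
#define B2(n) n, n + 1, n + 1, n + 2
#define B4(n) B2(n), B2(n + 1), B2(n + 1), B2(n + 2)
#define B6(n) B4(n), B4(n + 1), B4(n + 1), B4(n + 2)
static const unsigned char bits[256] = { B6(0), B6(1), B6(1), B6(2) };

/* Hypothetical 64-bit variant: one table lookup per 8-bit chunk.
   Shifting is used instead of a byte pointer, so the result does not
   depend on byte order. */
static int bit_count_bytes64(uint64_t x)
{
    int accu = 0;
    int i;
    for (i = 0; i < 8; i++) {
        accu += bits[x & 0xFF];
        x >>= 8;
    }
    return accu;
}

/* Cross-check routine: clear the lowest set bit until nothing is left. */
static int bit_count_sparse(uint64_t x)
{
    int n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Compare the two routines on a few values with 0-2 bits set, plus
   one dense value. Aborts via assert() on any mismatch. */
static void bit_count_selfcheck(void)
{
    static const uint64_t samples[] = {
        UINT64_C(0), UINT64_C(1), UINT64_C(0x8000000000000000),
        UINT64_C(0x0000010000000400), UINT64_C(0x0123456789ABCDEF)
    };
    unsigned i;
    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(bit_count_bytes64(samples[i]) == bit_count_sparse(samples[i]));
}
```

Calling bit_count_selfcheck() once before any timing loop is a cheap way to make sure both routines agree on the kind of sparse input used in the test above.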
Last modified: Thu, 15 Apr 21 08:11:13 -0700