Author: Dann Corbit
Date: 15:56:56 03/11/04
Go up one level in this thread
On March 11, 2004 at 10:59:28, Richard Pijl wrote: >On March 11, 2004 at 10:31:22, milix wrote: > >>Hello to everyone. >>I'm trying to optimize various parts of my bitboard engine and I found an >>optimization for my popcount() routine. I was using the routine: >> >>int popcount_1(const bitboard bb) { >> register int cnt=0; >> register bitboard a=bb; >> while (a) { >> cnt += bytecount[a & 0xff]; >> a >>= 8; >> } >> return cnt; >>} > >Here you are shifting a 64 bits integer on a 32 bits architecture. Basically >you're doing a lot of working shifting bits from one word to the other. >Richard. Also, there is a needless double copy of the passed in bitboard. In C (and C++ unless you use references) everything is passed by value. Hence, a const on a non-pointer paramter is always useless. You are already working on a copy. Also, it is pointless to decorate your programs with the register hint. This routine does the same thing as his first routine but faster: int popcount_1(bitboard a) { int cnt=0; while (a) { cnt += bytecount[a & 0xff]; a >>= 8; } return cnt; } There is no damage to 'bitboard a' in the calling program, since it is passed by value (a copy is made). An aside on the register hint: Don't do it. Not only does it not make your program faster (unless you have the world's lamest compiler) but it actually hinders your program. For instance, this is not legal C or C++: register int foo; int *bar = &foo; /* You cannot take the address of a register... */ >>where bytecount is a looup table for byte counts. >> >>I changed this routine a bit to >> >>int popcount_2(const bitboard bb) { >> register int cnt=0; >> unsigned long a1 = *((unsigned long*)&bb); >> unsigned long a2 = *(((unsigned long*)&bb)+1); >> while (a1) { >> cnt += bytecount[a1 & 0xff]; >> a1 >>= 8; >> } >> while (a2) { >> cnt += bytecount[a2 & 0xff]; >> a2 >>= 8; >> } >> return cnt; >>} >> >>and the routine popcount_2 was 2x times faster than popcount_1. It is even >>faster than the assembly routine provided by AMD for popcounts. I tested the >>routines using 1000 randomly generated bitboards. >>Do I miss something here or it is an improvement? Have anyone else tried a >>similar idea? >> >>My computer specs are: >>AMD Athlon XP 1600+ >>512 MB RAM
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.