Author: Dann Corbit
Date: 15:56:56 03/11/04
Go up one level in this thread
On March 11, 2004 at 10:59:28, Richard Pijl wrote:
>On March 11, 2004 at 10:31:22, milix wrote:
>
>>Hello to everyone.
>>I'm trying to optimize various parts of my bitboard engine and I found an
>>optimization for my popcount() routine. I was using the routine:
>>
>>int popcount_1(const bitboard bb) {
>> register int cnt=0;
>> register bitboard a=bb;
>> while (a) {
>> cnt += bytecount[a & 0xff];
>> a >>= 8;
>> }
>> return cnt;
>>}
>
>Here you are shifting a 64 bits integer on a 32 bits architecture. Basically
>you're doing a lot of working shifting bits from one word to the other.
>Richard.
Also, there is a needless double copy of the passed in bitboard.
In C (and C++ unless you use references) everything is passed by value.
Hence, a const on a non-pointer paramter is always useless. You are already
working on a copy. Also, it is pointless to decorate your programs with the
register hint. This routine does the same thing as his first routine but
faster:
int popcount_1(bitboard a) {
int cnt=0;
while (a) {
cnt += bytecount[a & 0xff];
a >>= 8;
}
return cnt;
}
There is no damage to 'bitboard a' in the calling program, since it is passed by
value (a copy is made).
An aside on the register hint:
Don't do it. Not only does it not make your program faster (unless you have the
world's lamest compiler) but it actually hinders your program.
For instance, this is not legal C or C++:
register int foo;
int *bar = &foo; /* You cannot take the address of a register... */
>>where bytecount is a looup table for byte counts.
>>
>>I changed this routine a bit to
>>
>>int popcount_2(const bitboard bb) {
>> register int cnt=0;
>> unsigned long a1 = *((unsigned long*)&bb);
>> unsigned long a2 = *(((unsigned long*)&bb)+1);
>> while (a1) {
>> cnt += bytecount[a1 & 0xff];
>> a1 >>= 8;
>> }
>> while (a2) {
>> cnt += bytecount[a2 & 0xff];
>> a2 >>= 8;
>> }
>> return cnt;
>>}
>>
>>and the routine popcount_2 was 2x times faster than popcount_1. It is even
>>faster than the assembly routine provided by AMD for popcounts. I tested the
>>routines using 1000 randomly generated bitboards.
>>Do I miss something here or it is an improvement? Have anyone else tried a
>>similar idea?
>>
>>My computer specs are:
>>AMD Athlon XP 1600+
>>512 MB RAM
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.