Author: Anthony Cozzie
Date: 08:05:30 03/11/04
Go up one level in this thread
On March 11, 2004 at 10:31:22, milix wrote: >Hello to everyone. >I'm trying to optimize various parts of my bitboard engine and I found an >optimization for my popcount() routine. I was using the routine: > >int popcount_1(const bitboard bb) { > register int cnt=0; > register bitboard a=bb; > while (a) { > cnt += bytecount[a & 0xff]; > a >>= 8; > } > return cnt; >} > >where bytecount is a looup table for byte counts. > >I changed this routine a bit to > >int popcount_2(const bitboard bb) { > register int cnt=0; > unsigned long a1 = *((unsigned long*)&bb); > unsigned long a2 = *(((unsigned long*)&bb)+1); > while (a1) { > cnt += bytecount[a1 & 0xff]; > a1 >>= 8; > } > while (a2) { > cnt += bytecount[a2 & 0xff]; > a2 >>= 8; > } > return cnt; >} > >and the routine popcount_2 was 2x times faster than popcount_1. It is even >faster than the assembly routine provided by AMD for popcounts. I tested the >routines using 1000 randomly generated bitboards. >Do I miss something here or it is an improvement? Have anyone else tried a >similar idea? > >My computer specs are: >AMD Athlon XP 1600+ >512 MB RAM In my experience the fastest popcount for Zappa is the simple for(a = 0; b != 0; b &= b -1, a++) ; where a is the counter and b is the bitboard. This works because most bitboards that I call it on are sparsely populated. anthony
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.