Author: Matt Taylor
Date: 23:28:43 01/21/03
On January 21, 2003 at 07:38:36, David Rasmussen wrote:

>
>What I meant was hoping for was x86 (Athlon XP, primarily) functions for _all_
>or most of the below simple inline functions, since it seems that MSVC and Intel
>generates horrible code (function calls for shifting etc.!) for these
>fundamental functions. They still manage to be a lot faster than gcc, Borland
>and Sun for some reasons.

I have been working on it, but I have other priorities at the moment, which is why I haven't posted anything.

<snip>

>>>INLINE BitBoard RankMask(Rank rank) { return rankMask[rank]; }
>
>rankmask is (as expected) a mask of all 1's at the relevant rank, so it's
>11111111 shifted rank*8 times to the left, if rank is zero-indexed. This can
>probably be done faster than a memory lookup too, if it's not put in the hands
>of MSVC and Intel, which would probably just do the shift with a function call.
>So, again: A faster assembly function should be possible. Please help, assembly
>programmers!

Definitely. I shifted (rank & 3) left by 3 to get a shift count of 0, 8, 16, or 24, then used that count to shift the 32-bit mask 0x000000FF left. The bit (rank & 4) determines whether the shifted mask becomes the low 32 bits or the high 32 bits of the 64-bit result. The other half of the result is zero.

>>>INLINE BitBoard FileMask(File file) { return fileMask[file]; }
>
>I don't know if this can be done fast. It's two shifts of two 32-bit words.
>Help?

The file mask can be done in a similar manner to the rank mask. I've already implemented it as a single 32-bit shift plus a copy to form the 64-bit result. Shifting the mask 0x01010101 left by the file number gives half of the result. Because both halves of a file mask are identical, you can treat that value as the lower half and copy it to the upper half.
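The rank and file constructions described above can be sketched in portable C. This is only an illustration of the idea, not the assembly routines discussed in this thread. The names `rank_mask_split` and `file_mask_split` are made up for the example, `BitBoard` is assumed to be a 64-bit unsigned integer, and the bit layout is assumed to put rank r in bits 8r..8r+7 and file f in bit f of every byte.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a BitBoard is a plain 64-bit unsigned integer. */
typedef uint64_t BitBoard;

/* Hypothetical helper: rank mask built from two 32-bit halves.
 * Only 32-bit shifts are applied to the mask itself. */
static inline BitBoard rank_mask_split(unsigned rank)
{
    assert(rank < 8);
    /* (rank & 3) << 3 yields a shift count of 0, 8, 16 or 24. */
    uint32_t half = (uint32_t)0x000000FF << ((rank & 3) << 3);
    /* Bit 2 of rank selects which half receives the mask. */
    uint32_t lo = (rank & 4) ? 0 : half;
    uint32_t hi = (rank & 4) ? half : 0;
    /* Combining the halves stands in for a register pair such as EDX:EAX. */
    return ((BitBoard)hi << 32) | lo;
}

/* Hypothetical helper: file mask from one 32-bit shift and a copy. */
static inline BitBoard file_mask_split(unsigned file)
{
    assert(file < 8);
    /* One set bit per byte, moved to the requested file. */
    uint32_t half = (uint32_t)0x01010101 << file;
    /* Both halves of a file mask are identical. */
    return ((BitBoard)half << 32) | half;
}
```

On a 32-bit x86 target the two halves would normally live in separate registers, so the final combining step costs nothing there. Whether this beats the table lookup would still have to be timed.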
<snip>

>>>INLINE Square Rotate45Left(Square square) { return rotate45Left[square]; }
>>>INLINE Square Rotate45Right(Square square) { return rotate45Right[square]; }
>>>INLINE Square Rotate90Left(Square square) { return rotate90Left[square]; }
>>>INLINE Square UnRotate45Left(Square square) { return unrotate45Left[square]; }
>
>Maybe something can be done for these too, that is faster than a memory lookup?

This is about where I stopped. I gave it a little thought, but I spent most of my time working on the bit scan and popcount routines. I have not gotten around to timing them yet.

>In general, I would like assembly functions for all of these inline functions
>above and below, that are faster than their originals on Intel and MSVC.
>
>Pretty please?

<snip>

Guaranteed!

-Matt
Last modified: Thu, 15 Apr 21 08:11:13 -0700