Author: Dann Corbit
Date: 17:33:04 12/11/01
Go up one level in this thread
On December 11, 2001 at 20:27:37, Wylie Garvin wrote: >Hi, > > I've noticed some posts where people mention doing things like this: > > BSF ecx,dword ptr [foo+4] > ADD ecx,32 > BSF ecx,dword ptr [foo] > > On the assumption that undefined --> unchanged for PIII, PIV. While I don't >have any evidence to contradict that assumption, it seems like a bad idea to >rely on undefined behaviour. You must be sure that the function is never called with all zero bits. > I point out the conditional CMOVcc instructions which have been supported >since around the time of the PPro I think: > > BSF ecx,dword ptr [foo+4] > ADD ecx,32 > BSF eax,dword ptr [foo] > CMOVZ eax,ecx > > This technique will work correctly even if eax happens to be trashed by the >BSR instruction if no bits were set. > > Also I'd like to mention something that I use to speed up my attack >calculations with BSF/BSR...I don't know if it will help you C/C++ programmers, >but it's pretty cool in assembly. > > Notice that of the 8 rays originating in one half of the board, only 3 of them >can possibly cross over to the other side of the board. So you can reduce that >construction above to a single BSR/BSF in the other 5 cases. My assembly code >first decides which half of the board the source square is in, then individually >probes the ends of the ray. To ensure that rays are always terminated, before I >start my attack calculations I build a temporary bitboard from the "all pieces" >bitboard by ORing in the edges (0xFF818181,818181FF) and then clearing the bit >for the source square. Now I need about 5 BSF/BSRs for rook/queen attacks and 6 >for bishop/queen attacks. > > Also keep in mind that removing a conditional branch in the fashion shown >above requires doing the computation for *both* halves of the branch. In this >case it appears to be a win; but if extra computation is required to produce the >bitboard, the cost could exceed the cost of a conditional branch. In general, >if the branch is predictable (e.g. if the rays are almost always short, and so >almost always stay on the same side of the board) it's likely better not to >remove it. I have found that with the latest Intel compilers, the bit tricks no longer gain any speed. Not the MMX tricks, not the BSF/BSR -- nothing. The ANSI C is just as fast. Strike that -- faster.
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.