Author: Dieter Buerssner
Date: 17:11:57 06/25/02
On June 25, 2002 at 17:19:20, Robert Hyatt wrote:

>On June 25, 2002 at 17:09:42, Dieter Buerssner wrote:
>
>>On June 25, 2002 at 16:35:26, Gian-Carlo Pascutto wrote:
>>
>>>if (board[E6] == wpawn)
>>>
>>>1 load
>>>1 compare
>>>
>>>if (WhitePawns & Mask[e6])
>>>
>>>2 64-bit loads
>>>1 64-bit and
>>>1 64-bit compare
>>
>>Hidden in some macros one could write
>>
>>  if ((WhitePawns.low32 & Mask[e6].low32)
>>      | (WhitePawns.high32 & Mask[e6].high32))
>>
>>4 32 bit loads,
>>2 32 bit ands
>>1 32 bit or and branch based on the flags.
>>
>>An intelligent compiler could produce better code with something like
>>
>>if (WhitePawns & (1ULL << E6))
>>
>>1 32-bit load,
>>1 32-bit and, branch based on the flags.
>>
>>Of course, this optimization cannot work anymore, when E6 is "sq".
>
>
>Actually it can. while waiting on the load (several clocks) you could
>do the following, which we did in Cray Blitz all over the place:
>
>(pseudo-code rather than Cray assembly):

I think it was clear that my code snippet and operation counts were meant in the context of a 32-bit environment. There, the compile-time constant 1ULL<<E6 (through micro-optimization at the source level, or thanks to some clever compiler) means that only one 32-bit half of the 64-bit bitboard has to be touched. When E6 is no longer a constant, both halves must be processed.

Of course, many operations may be done in parallel. However, I doubt that two 32-bit memory loads, a shift across two 32-bit words, ... can ever be as fast as a single 32-bit load and an AND (no shift).

For my program (I use bitboards for pawns and not much more), I can expect the memory loads to come from L1 cache in many cases, so there will not be much waiting time for the memory load.

Regards,
Dieter
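To make the operation counts above concrete, here is a minimal C sketch of the two cases on a 32-bit target. It is only an illustration under stated assumptions, not code from any of the programs discussed: the `Bitboard` struct with `low32`/`high32` halves mirrors the macro example in the quote, the square numbering (a1 = 0, so E6 = 44) is assumed, and the function names `pawn_on_e6` and `pawn_on_sq` are hypothetical. It shows which operations are needed, not how fast they run on any particular CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: a 64-bit bitboard kept as two 32-bit halves. */
typedef struct {
    uint32_t low32;   /* squares 0..31  */
    uint32_t high32;  /* squares 32..63 */
} Bitboard;

/* Assumed numbering: a1 = 0, b1 = 1, ..., h8 = 63, so e6 = 5*8 + 4. */
enum { E6 = 44 };

/* Constant square: the mask 1ULL << E6 is known at compile time and
 * lies entirely in the high half, so one 32-bit load and one 32-bit
 * AND are enough. No shift is done at run time. */
static int pawn_on_e6(const Bitboard *pawns)
{
    return (pawns->high32 & (UINT32_C(1) << (E6 - 32))) != 0;
}

/* Variable square: written to mimic a generic 64-bit "1ULL << sq"
 * on a 32-bit machine. The mask is built across both words, then
 * both halves of the bitboard are loaded, ANDed, and combined. */
static int pawn_on_sq(const Bitboard *pawns, int sq)
{
    assert(sq >= 0 && sq < 64);
    uint32_t mask_low  = (sq < 32)  ? (UINT32_C(1) << sq)        : 0;
    uint32_t mask_high = (sq >= 32) ? (UINT32_C(1) << (sq - 32)) : 0;
    return ((pawns->low32 & mask_low) | (pawns->high32 & mask_high)) != 0;
}
```

Both functions give the same answer for E6; the difference is only in how much work is left for run time. A call such as `pawn_on_sq(&white_pawns, E6)` (with `white_pawns` a hypothetical `Bitboard` variable) goes through the two-word path unless the compiler can see that the argument is constant.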
Last modified: Thu, 15 Apr 21 08:11:13 -0700