Author: Matt Taylor
Date: 00:00:34 01/22/03
Go up one level in this thread
On January 21, 2003 at 17:53:52, Gerd Isenberg wrote: >On January 21, 2003 at 15:48:57, Sander de Zoete wrote: > >>The following instruction I found for the new architecture of Intel Chips >> ><snip BSWAP> >>It is only valid for 486 architecture. > >Hi Sander, >I think BSWAP is introduced with 386, but i'm not quite sure. >Pentium and Athlon are compatible. Like Dave said, new to 486. >>If this instruction can be used, it should be very easy to reverse Reverse >>BitBoards into Forward Boards again. Saving a lot of updating in make and unmake >>move and also be much faster using the bitboards for generating attack >>board for evaluation purposes, incheck checks etc. > >Yes, nice idea.... >But if you have bit 0..63 resp. byte 0..7 in one bitboard (two 32bit registers), >you have to swap bit n with bit 63-n. So you must swap byte n with 7-n instead >of 3-n. Good point... I think it could be done in 3-4 clocks with a series of bswaps and masking. >>The thing left to do is to reverse the bits per byte. >> >>I use the following trick, but it doesn't seem to work voor the value 9. Let me >>show you what I mean: (and ofcourse, can you help me?) >> >>In C code, it looks like this: >> >>typedef unsigned __int64 BITBOARD; >> >>void ReverseBitsPerByte(BITBOARD bitboard) >>{ >>// 1. Load the constant, k = 5555555555555555h >>BITBOARD k = 0x5555555555555555; >> >>// 2. x = [(x shl 1) and k] or [(x and k) shr 1] result is: EFCDAB8967452301 >>bitboard = ((bitboard<<1) & k) | ((bitboard & k)>>1 ); >>} >and >3. Load the constant, k = 3333333333333333h >4. shift 2 ... >and finally the nibble swap >5. Load the constant, k = 0f0f0f0f0f0f0f0fh >6. shift 4 ... > >Doing the byte swap n<->(7-n) before or after these byte mirroring doesn't >matter. BSWAP may be faster than shift (don't know) to get the higher 16 bits to >the lower half and to use some movs with byte registers later in some clever >way. On Athlon, bswap is DirectPath and 1 clock -- identical to other ALU ops like shift. >If i only count the mirror instructions: 6 shifts, 6 ands, 3 ors to reverse each >byte 64-bit wise, makes 15 64-bit instructions. > >KoggeStone also needs exactly 6 shifts, 6 ands, 3 ors (for file only 5 ands). >But it is possible to use the propagators for multiple sets of same piece kind >(if enaugh registers are available). > >You need KoggeStone for each direction but mirroring only for the reversed, so >it seems that the reversed bitboard approach on the fly is not as bad as i >thought initially and worth to try. >But the byte mirroring is not all of course. You need further instructions to >swap the bytes and even the x^(x-2) for each direction is not for free and >requires some additional shifts/ands for the none rank directions. > >Another point is register dependency. It would be a good idea to mirror two or >up to four bitboards simultaniously, like an efficient KoggeStone implementation >computes several directions in parallel. Definitely. >As far as i see, Reversed Bitboards may only generate attacks for a set of >multiple rooks in left/right directions (with SIMD byte arithmetics). For all >other directions it only works with masking one file or diagonal. IMHO the main >advantage of KoggeStone or fill-algorithms. Otherwise getting attack sets for >single sliding pieces - it's hard to beat rotated. > >Regards, >Gerd > ><snip> -Matt
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.