Author: Dann Corbit
Date: 03:44:08 01/22/03
Go up one level in this thread
On January 22, 2003 at 06:09:22, Dann Corbit wrote: >On January 22, 2003 at 05:33:20, Matt Taylor wrote: > >>On January 22, 2003 at 04:27:16, Dann Corbit wrote: >> >>>On January 22, 2003 at 03:29:05, Matt Taylor wrote: >>> >>>>On January 21, 2003 at 17:03:33, Dann Corbit wrote: >>>> >>>>>On January 21, 2003 at 15:48:57, Sander de Zoete wrote: >>>>> >>>>>>The following instruction I found for the new architecture of Intel Chips >>>>>> >>>>>>BSWAP—Byte Swap >>>>>> >>>>>>Description >>>>>>Reverses the byte order of a 32-bit (destination) register: >>>>>> >>>>>>Operation >>>>>>TEMP ¬ DEST >>>>>>DEST(7..0) ¬ TEMP(31..24) >>>>>>DEST(15..8) ¬ TEMP(23..16) >>>>>>DEST(23..16) ¬ TEMP(15..8) >>>>>>DEST(31..24) ¬ TEMP(7..0) >>>>>>Flags Affected >>>>>>None. >>>>>>Opcode Instruction Description >>>>>>0F C8+ rd BSWAP r32 Reverses the byte order of a 32-bit register. >>>>>> >>>>>>It is only valid for 486 architecture. >>>>>> >>>>>>If this instruction can be used, it should be very easy to reverse Reverse >>>>>>BitBoards into Forward Boards again. Saving a lot of updating in make and unmake >>>>>>move and also be much faster using the bitboards for generating attack >>>>>>board for evaluation purposes, incheck checks etc. >>>>>> >>>>>>The thing left to do is to reverse the bits per byte. >>>>>> >>>>>>I use the following trick, but it doesn't seem to work voor the value 9. Let me >>>>>>show you what I mean: (and ofcourse, can you help me?) >>>>>> >>>>>>In C code, it looks like this: >>>>>> >>>>>>typedef unsigned __int64 BITBOARD; >>>>>> >>>>>>void ReverseBitsPerByte(BITBOARD bitboard) >>>>>>{ >>>>>>// 1. Load the constant, k = 5555555555555555h >>>>>>BITBOARD k = 0x5555555555555555; >>>>>> >>>>>>// 2. x = [(x shl 1) and k] or [(x and k) shr 1] result is: EFCDAB8967452301 >>>>>>bitboard = ((bitboard<<1) & k) | ((bitboard & k)>>1 ); >>>>>>} >>>>>> >>>>>>Initial bitboard: >>>>>>11111110 >>>>>>11011100 >>>>>>10111010 >>>>>>10011000 >>>>>>01110110 >>>>>>01010100 >>>>>>00110010 >>>>>>00010000 >>>>>> >>>>>>Result of ReverseBitsPerByte(bitboard): >>>>>>01111111 >>>>>>00111011 >>>>>>01011101 >>>>>>00011000<--what the ^*&* goes wrong here? Should be 00011001. >>>>>>01101110 >>>>>>00101010 >>>>>>01001100 >>>>>>00001000 >>>>>> >>>>>>Thanks for any help or suggestions. >>>>> >>>><snip code> >>>> >>>>These are all basically the same algorithm. It's going to take a while to >>>>rearrange everything when you do it bit-by-bit. Sander has the right idea by >>>>first swapping bytes and then trying to reverse the bits within the bytes (8*2 >>>>ops + the swap as opposed to 64*2 ops). >>> >>>The algorithms don't perform identically. >>>They don't require 64 ops. >>>Once you get the "Sander" algorithm completed, compare the two (and also the >>>COBRA algorithm). >> >>Yes, breverse5 is a divide-and-conquer algorithm. The others all iterate 64 >>times. Assuming 2 ops/iteration was a gross underestimate as they use heavy >>masking and shifting. >> >>2 bswaps (and probably a mov, too) will reverse the bytes; afterward, the bits >>can be reversed as well. >> >>Not sure when or if I can have a look at COBRA and think about the algorithm -- >>I've got more than enough mandatory work right now. > >The COBRA algorithm is designed for big bit sequences. I don't know how well it >adapts to 64 bits. I have only just downloaded the paper. > >This algorithm: > >typedef unsigned long ling Bitboard; > >Bitboard breverse5(Bitboard n) >{ > int i = 64; > Bitboard m = -1; > while (i /= 2) > m ^= m << i, n = n >> i & m | (n & m) << i; > return n; >} > >iterates 6 times. I suspect it will be faster than something table driven. >Perhaps also faster than 8 mask and shift operations. > >I'll be surprised if anything can beat it by 20%. Here is the generated assembly language. It would look a lot prettier on a 64 bit chip, for sure. PUBLIC ?breverse5@@YI_K_K@Z ?breverse5@@YI_K_K@Z PROC NEAR .B4.1: ; 202 ; Preds .B4.0 ;;; { 00000 57 push edi ;C:\tmp\brev.cpp 00001 56 push esi ;C:\tmp\brev.cpp 00002 55 push ebp ;C:\tmp\brev.cpp 00003 53 push ebx ;C:\tmp\brev.cpp 00004 83 ec 14 sub esp, 20 ;C:\tmp\brev.cpp 00007 8b 54 24 28 mov edx, DWORD PTR [esp+40] ;C:\tmp\brev.cpp 0000b 8b 5c 24 2c mov ebx, DWORD PTR [esp+44] ;C:\tmp\brev.cpp ;;; int i = 64; ;;; Bitboard m = -1; ;;; while (i /= 2) ;;; { ;;; m ^= m << i, n = n >> i & m | (n & m) << i; ;;; iterations++; 0000f 8b 0d 00 00 00 00 mov ecx, DWORD PTR ?iterations@@4IA ;C:\tmp\brev.cpp 00015 89 4c 24 0c mov DWORD PTR [esp+12], ecx ;C:\tmp\brev.cpp 00019 89 54 24 08 mov DWORD PTR [esp+8], edx ;C:\tmp\brev.cpp 0001d b8 ff ff ff ff mov eax, -1 ;C:\tmp\brev.cpp 00022 89 44 24 04 mov DWORD PTR [esp+4], eax ;C:\tmp\brev.cpp 00026 be ff ff ff ff mov esi, -1 ;C:\tmp\brev.cpp 0002b bf 20 00 00 00 mov edi, 32 ;C:\tmp\brev.cpp ; LOE ebx esi edi .B4.2: ; 1212 ; Preds .B4.11 .B4.1 00030 8b 44 24 04 mov eax, DWORD PTR [esp+4] ;C:\tmp\brev.cpp 00034 8b d6 mov edx, esi ;C:\tmp\brev.cpp 00036 8b cf mov ecx, edi ;C:\tmp\brev.cpp 00038 e8 fc ff ff ff call __allshl ;C:\tmp\brev.cpp ; LOE eax ebx esi edi .B4.7: ; 1212 ; Preds .B4.2 0003d 31 44 24 04 xor DWORD PTR [esp+4], eax ;C:\tmp\brev.cpp 00041 33 f2 xor esi, edx ;C:\tmp\brev.cpp 00043 8b 44 24 08 mov eax, DWORD PTR [esp+8] ;C:\tmp\brev.cpp 00047 8b d3 mov edx, ebx ;C:\tmp\brev.cpp 00049 8b cf mov ecx, edi ;C:\tmp\brev.cpp 0004b e8 fc ff ff ff call __aullshr ;C:\tmp\brev.cpp ; LOE eax ebx esi edi .B4.9: ; 1212 ; Preds .B4.7 00050 8b ea mov ebp, edx ;C:\tmp\brev.cpp 00052 8b 54 24 04 mov edx, DWORD PTR [esp+4] ;C:\tmp\brev.cpp 00056 23 c2 and eax, edx ;C:\tmp\brev.cpp 00058 89 44 24 10 mov DWORD PTR [esp+16], eax ;C:\tmp\brev.cpp 0005c 8b 44 24 08 mov eax, DWORD PTR [esp+8] ;C:\tmp\brev.cpp 00060 23 ee and ebp, esi ;C:\tmp\brev.cpp 00062 23 c2 and eax, edx ;C:\tmp\brev.cpp 00064 23 de and ebx, esi ;C:\tmp\brev.cpp 00066 8b d3 mov edx, ebx ;C:\tmp\brev.cpp 00068 8b cf mov ecx, edi ;C:\tmp\brev.cpp 0006a e8 fc ff ff ff call __allshl ;C:\tmp\brev.cpp ; LOE eax ebp esi edi .B4.11: ; 1212 ; Preds .B4.9 0006f 83 44 24 0c 01 add DWORD PTR [esp+12], 1 ;C:\tmp\brev.cpp 00074 8b da mov ebx, edx ;C:\tmp\brev.cpp 00076 0b dd or ebx, ebp ;C:\tmp\brev.cpp 00078 0b 44 24 10 or eax, DWORD PTR [esp+16] ;C:\tmp\brev.cpp 0007c 89 44 24 08 mov DWORD PTR [esp+8], eax ;C:\tmp\brev.cpp 00080 81 c7 00 00 00 80 add edi, -2147483648 ;C:\tmp\brev.cpp 00086 81 d7 00 00 00 80 adc edi, -2147483648 ;C:\tmp\brev.cpp 0008c d1 ff sar edi, 1 ;C:\tmp\brev.cpp 0008e 85 ff test edi, edi ;C:\tmp\brev.cpp 00090 75 9e jne .B4.2 ; Prob 83% ;C:\tmp\brev.cpp ; LOE eax ebx esi edi al ah .B4.3: ; 202 ; Preds .B4.11 00092 8b 4c 24 0c mov ecx, DWORD PTR [esp+12] ; 00096 89 0d 00 00 00 00 mov DWORD PTR ?iterations@@4IA, ecx ;C:\tmp\brev.cpp 0009c 8b d3 mov edx, ebx ; ;;; } ;;; return n;
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.