Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Resources about rotated bitboards

Author: Robert Hyatt

Date: 06:54:59 01/16/04

Go up one level in this thread


On January 16, 2004 at 00:29:28, Russell Reagan wrote:

>On January 15, 2004 at 11:01:46, Robert Hyatt wrote:
>
>>I am not sure that my current
>>implementation is "optimal" either.  So even if I did a comparison, I'm not
>>sure it would be valid.
>
>Tim Foden showed an interesting method that was very cache friendly. Something
>like this.
>
>static BYTE s_patMoves[8][256]; // 2.00K [R/F][pattern]
>static UINT64 s_extendVert[256]; // 2.00K [pattern]
>static UINT64 s_extendHorz[256]; // 2.00K [pattern]
>
>static const UINT64 s_rankMask[8]; // 0.06K [file]
>static const UINT64 s_fileMask[8]; // 0.06K [rank]
>static const UINT64 s_diagAMask[15]; // 0.12K [diagA]
>static const UINT64 s_diagHMask[15]; // 0.12K [diagH]
>
>inline int Rank( Sq sq ) { return sq >> 3; }
>inline int File( Sq sq ) { return sq & 0x7; }
>
>inline BYTE PatMoves( int rf, int pat )
>{ return CStatics::s_patMoves[rf][pat]; }
>inline UINT64 ExtVert( int pat ) { return CStatics::s_extendVert[pat]; }
>inline UINT64 ExtHorz( int pat ) { return CStatics::s_extendHorz[pat]; }
>
>inline UINT64 RankMask( int r ) { return CStatics::s_rankMask[r]; }
>inline UINT64 FileMask( int f ) { return CStatics::s_fileMask[f]; }
>
>inline UINT64 CBoard::GenRAttacksBitmap( Sq sq ) const
>{
>    int r = Rank(sq);
>    int f = File(sq);
>    BYTE movesR = PatMoves(f, m_rot[cHorz].bytes[r]);
>    BYTE movesF = PatMoves(r, m_rot[cVert].bytes[f]);
>    UINT64 bitmap = (ExtVert(movesR) & RankMask(r)) |
>        (ExtHorz(movesF) & FileMask(f));
>    return bitmap;
>}
>
>inline int DiagA( int rank, int file ) { return 7 + rank - file; }
>inline int DiagA( Sq sq ) { return DiagA(Rank(sq), File(sq)); }
>inline int DiagH( int rank, int file ) { return rank + file; }
>inline int DiagH( Sq sq ) { return DiagH(Rank(sq), File(sq)); }
>
>inline UINT64 DiagAMask( int a ) { return CStatics::s_diagAMask[a]; }
>inline UINT64 DiagHMask( int h ) { return CStatics::s_diagHMask[h]; }
>
>inline UINT64 CBoard::GenBAttacksBitmap( Sq sq ) const
>{
>    int r = Rank(sq);
>    int f = File(sq);
>    int a = DiagA(r, f);
>    int h = DiagH(r, f);
>    BYTE movesA = PatMoves(f, m_rot[cDiagA].bytes[a & 7]);
>    BYTE movesH = PatMoves(f, m_rot[cDiagH].bytes[h & 7]);
>    UINT64 bitmap = (ExtVert(movesA) & DiagAMask(a)) |
>        (ExtVert(movesH) & DiagHMask(h));
>    return bitmap;
>}
>
>I'm not sure if this is necessarily faster (especially if you have a big cache),
>but it seems like a novel approach. Maybe it will give you some new ideas, if
>you hadn't seen it already.

It's already been done, and it is actually a pretty obvious idea, but it
sucks badly on some architectures.  The Cray for example, doesn't address
bytes of memory, only 64 bit words.  Chars become ugly, and you do as much
trying to extract the char as you do piecing the attacks together.  I have
tried that in a test (not a complete deal in Crafty) and I didn't particularly
see any advantage.  My current code uses 4 arrays of 64x64x8 which is 128K
total, a fraction of current cache on 64 bit processors that tyically start at
1M and go up.

>
>Then there are the reversed bitboards which take advantage of the x^(x-2) trick
>that Gerd uses. The only bad thing about those is that you end up having a
>reversed bitboard for each direction (as opposed to three rotated bitboards), so
>you have significantly more to update. One thing that I kind of like about it is
>that you can generate attacks in single directions (as opposed to generating
>rook attacks in all four directions), which might be useful for some things. The
>rotated bitboards are pretty fast, and doing x^(x-2) four or eight times might
>be slower than the lookup tables used by the rotated approach.

The rotated bitmaps clearly produce the least work.  The issue becomes the
memory bandwidth they need if cache is not big enough.  That's why the
COMPACT_ATTACKS stuff was done years ago, because back then cache was very
small and 128kb would blow at least the PC and Sparc cache pretty badly.  (Of
course, back then the attack tables were also 512kb which really blew caches
of 1995).  The 128K today is a pretty good compromise.  One extra AND operation
per load while reducing the table size to 1/4 its original size.

>
>And of course, there are the filler routines that Gerd uses. That requires a
>_whole_ new way of thinking about a chess program. It takes a person a while to
>"think bitboards", and then you start from scratch again trying to figure out
>how to get those fillers to work efficiently. I tried it for a while, and...I
>have a new appreciation for Gerd's accomplishments :)


This is a whole new field, starting with bitmaps and working outward.  I
doubt it is "over" yet either. :)

Except for those that say "bitmaps suck" of course... :)



>
>Basically my point is, there is probably still nontrivial room for improvement
>with all of these very different approaches to bit twiddling.

I'm absolutely _certain_ there is much left to learn...




This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.