Author: Gerd Isenberg

Date: 13:45:51 02/22/06

Hi, some more elaboration and more detailed explanation on "branchless" sliding move generation. Hope it will at least be entertaining and eventually will motivate some to have a closer look to a Westcott-like fillbased bitboard framework. The idea was motivated by the well known paper: http://supertech.csail.mit.edu/papers/debruijn.pdf specially Chapter 4 Indexing two 1's prolog - move-indices ===================== The first thing was to save bitscans in my fillbased bitboard framework. Determing unique move-indices per direction by performing a multiplicative DeBruijn Hashfunction with "ored" isolated from- and to-bitboards. 224 or 140 possible unique from-to move combinations per straight and diagonal ray: 8*7 + 8*6 + 8*5 + 8*4 + 8*3 + 8*2 + 8*1 = 8*28 = 224 1*7 + 3*6 + 5*5 + 7*4 + 9*3 + 11*2 + 13*1 = 5*28 = 140 In Chapter 4, "Indexing two 1's" the authors of the above paper introduce a hashfunction to map all 64-bit numbers with at most two 1's (64*63/2+64+1 == 2081) into a table with 32706 entries: h(x) ::= (x * 0xE50FA91BE3A25401) >> (64-15); What, if the both 1's were dependent, such as move from-to bits of a sliding piece? Should be a much smaller range - but how much? Some time ago i wrote a recursive deBruijn generator to find factors to map all possible from-to-bitboards. Each time a deBruijn constant was found at the leaf of the De Bruijn tree, one or more unambiguously test for all enumerated from-to sets per direction were applied to look for collisions. I tried modified (some add and xor) DeBruins, ored "fromto" with distinct directions, as well as "from" minus "to" with combined opposite directions and a lot more. The most compact lookup-arrays was with distinct directions. With a lot of fitting DeBruijn and modified DeBruijn constants. The 140 diagonals maps into a 1/2K range, the 240 straights into 1K: strMoveIdx ::= ((strFrom|strTo)*DeBruijn) >> (64-10); diaMoveIdx ::= ((diaFrom|diaTo)*DeBruijn) >> (64-9); I intend to live with cute moveindices, without further lookup except for straight sliding moves. DoMove is basically a 3*16 byte-xor and a 16-byte and/add - but that is another story... The movelist idea ================= Recently i got the idea to map not only single moves but movelists of 0..7 moves, determined a sliding attack ray set. "Surprisingly" at the first glance, the number of possible none empty ray attack or target sets is exactly the number of moves per ray. But the number of bits vary from 1..7 (or even 0..7) considering empty sets. This 140(+1) or 224(+1) unique move target bitboards map to a 9 or 10 bit index range as well - fitting DeBruijns are a bit more rarely as for the pure move indices. strMoveListIndex ::= (strAttacks*DeBruijn) >> (64-10); diaMoveListIndex ::= (diaAttacks*DeBruijn) >> (64-9); Example with north-east direction with "little endian" mapping - white bishop on b2, black piece on f6 as "blocking" piece: ( h8=63 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 a1=0 h1=7 * 0x0219c7bf2a4dae8b ) >> (64-9) ==> 229 (the 55. from 140 unique indices in a 2**9 == 512 range) Now, array slot 229 refers the precalculated fixed sized movelist of eight 32-bit integers, up to seven moves and number of moves (always starts with greatest distance as this is eventually a capture target): ptr[229] => {b2f6, b2e5, b2d4, b2c3, null, null, null, 4} Now sliding move generation becomes rather loop-, branch- and bitscan-less: // e.g. kogge stone fill with single bishop BB noeast = getNorthEastAttacks(bishop); BB targets = noeast & ~ownPieces; idx = (targets * 0x0219c7bf2a4dae8b) >> (64-9); ptr = NorthEastBishopMoves + NorthEastIdx[idx]; // unconditional move of 7(8) ints movelist[x+0] = ptr->i[0]; movelist[x+1] = ptr->i[1]; movelist[x+2] = ptr->i[2]; movelist[x+3] = ptr->i[3]; movelist[x+4] = ptr->i[4]; movelist[x+5] = ptr->i[5]; movelist[x+6] = ptr->i[6]; // first is possible capture target uint piece2cap = board[movelist[x+0].to]; // captured piece or empty // modify if capture move movelist[x+0] |= captureFlags[piece2cap]; // increment index from 0..7 x += pt Unconditional moving seven ints avoids branches and may be done by a special memcpy with simd-wise with 128-bit SSE2-intrinsics with two aligned movdqa, while writing unaligned with two movdqu to the movelist. The precalculated moves inside a movelist may contain some static move score, based on pure geometrical center distance relation as a rough estimate for some movesorting. For a pure capture search, only one move and increment by max(1,ptr[idx]->i[7]) is necessary. So far so good - similar is possible with rotated. First one may triple the rotated lookup-table sizes by introducing not only combined opposite direction attacks but also the two disjoint. Second on may also index precalculated movelists with occupied state and square index - even if handling captures or whether attacked squares ar occupied by own pieces becomes a bit more complicated. Keeping direction wise disjoint attacks has some merits - also for determing pinned pieces, remove checker x-rays etc. more sophisticated idea and further brainstorming ================================================= I like to determine checking moves during generation in advance. May be it is not so important for move ordering (trying checks earlier might produce bigger trees), but to eventually avoid possible futility pruning near or at the tips. Imagine the above north-east set of a white b2 bishop and the black king on b8 so that the white bishop may check on e5 (x): h8=63 0 k 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 x 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 B 0 0 0 0 0 0 0 0 0 0 0 0 0 0 a1=0 h1=7 I have all bishop-like (as well as rook-like) attacks from the opposite king (all 4 diagonal directions combined) - a set of possible check targets for bishops and queens . If the intersection of the disjoint direction attack set with the appropriate checkset is not zero, there is a one square intersection - otherwise we could capture the king. I was thinking about how to squeeze this possible one-bit set, a check target square, to the original attack set but still keeping unique sets with a perfect hash-function. idx = (modifiedTargets * deBruijn) >> (64-nbitsneeded); with the check tag already set for B/Q b2e5: ptr[idx] => {b2f6, b2e5+, b2d4, b2c3, null, null, null, 4} or alternative, if check moves were already generated in an earlier pass: ptr[idx] => {b2f6, b2d4, b2c3, null, null, null, null, 3} How to modify targets and how many bits are needed for a most compact value range? The number of 224 distinct rook targets per direction becomes 4 times bigger - 896 with up to one check-target per set. The number of 140 bishop targets becomes only 3.4 times bigger - 476 with up to one check-target per set! Of course one couldn't modify target by xoring checkset, because sets become obviously ambiguous if outer bits are affected. I thought about oring with some left and some right shift of the checkset and that seems to work - no matter how small and big the check square index is (x<<left)|(x>>right) always becomes none zero if x is power of two and left+right < 64: modifiedTargets ::= targets | (x<<left) | (x>>right); I changed the deBruijnFound function of my De Bruijn generator with two left/right shift amount loops, to supply the unambiguously test function not only with the De Bruijn constant and number of bits of the range, but also with left/right shift amounts for the checksets. The diagonal range grows from 9-bit to 12-bit - 8-times bigger, 4K instead of 1/2K to map 476 entries instead of 140 - some diagonals deBruijn: dir #bits min max # << >> deBruijn NoEast 12 184 4049 476 33 2 0x021934563a97e7b7 SoEast 12 27 4060 476 34 3 0x0219b1f92ef456a7 SoWest 12 225 4081 476 25 7 0x021937151d2fcf6b NoWest 12 32 3961 476 1 26 0x021977e71eb4549b Unfortunately the straight range seem to grow from 10-bit to 14-bit - 16-times bigger, 16K instead of 1K to map 896 instead of 224. So sliding move generation with check information may look that way: // e.g. kogge stone fill with single bishop BB noeast = getNorthEastAttacks(bishop); BB targets = neast & ~ownPieces; // if ( targets ) BB checkset = targets & diagonalCheckTargets; targets |= (checkset<<33) | (checkset>>2); uint idx = (targets * 0x021934563a97e7b7) >> (64-12); ptr = NorthEastBishopMovesC + NorthEastIdxPlusC[idx]; // unconditional move of 7(8) ints movelist[x+0] = ptr->i[0]; movelist[x+1] = ptr->i[1]; movelist[x+2] = ptr->i[2]; movelist[x+3] = ptr->i[3]; movelist[x+4] = ptr->i[4]; movelist[x+5] = ptr->i[5]; movelist[x+6] = ptr->i[6]; // first is possible capture target uint piece2cap = board[movelist[x+0].to]; // captured piece or empty // modify if capture move movelist[x+0] |= captureFlags[piece2cap]; // increment index from 0..7 x += ptr[idx]->i[7]; // += n Still brainstorming - and corious about my first perft counts. Cheers, Gerd

- additional hint for rotated
**Gerd Isenberg***14:50:07 02/22/06*

This page took 0.03 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.