Author: Tony Werten
Date: 00:43:27 04/18/04
Go up one level in this thread
On April 16, 2004 at 18:09:14, Andreas Guettinger wrote:
>On April 16, 2004 at 17:47:09, Gerd Isenberg wrote:
>
>>On April 16, 2004 at 17:14:14, Andreas Guettinger wrote:
>>
>>>On April 16, 2004 at 15:51:25, Gerd Isenberg wrote:
>>>
>>>>On April 16, 2004 at 06:14:25, Andreas Guettinger wrote:
>>>>
>>>>>Could somebody recap how the final macros look like, all inlined?
>>>>>What bitboards are needed to use the macros?
>>>>>Is the overhead of maintaining the bitmaps worth it?
>>>>>Is it a speedup to 0x88? Do you think its faster than a for loop checking for
>>>>>blocked squares between x and y?
>>>>>
>>>>>regards
>>>>>Andy
>>>>
>>>>Hi Andy,
>>>>
>>>>actually i would suggest something like this, bitboard return, to additionally
>>>>check whether the popcount is zero or one, probably looking for xrays, pins and
>>>>remove checkers.
>>>>
>>>>inline
>>>>BitBoard brq_path_clear(x88sqr gt, x88sqr ls, U64 BitBoard) {
>>>> return (isqBit(gt) - 2*isqBit(ls)) & path & all;
>>>>}
>>>>
>>>>BitBoard bstyle_attack(x88sqr x, x88sqr y) {
>>>> int dir = direction0x88Diff[0x88+x-y]); // 256 sized lookup
>>>> if ( dir & isDiagonalDirection )
>>>> {
>>>> BitBoard path = pathOfDirection[dir];
>>>> if ( x > y ) return brq_path_clear(x, y, path);
>>>> return brq_path_clear(y, x, path);
>>>> }
>>>> return (BitBoard) -1;
>>>>}
>>>>
>>>>Same for rooks, excecpt looking for straight rays.
>>>>
>>>>This is cache friendly, has two conditional jumps and some computations.
>>>>
>>>>Probably a 64*64 inbetween lookup with x and y is faster, but pollutes the cache
>>>>32KByte (+4/8/16Kbyte) a bit more.
>>>>One should try, saves some instructions and missed branches.
>>>>
>>>>BitBoard bstyle_attack(square64 x, square64 y) {
>>>> int dir = direction[x][y]; // 4K char(short/int lookup
>>>> if ( dir & isDiagonalDirection )
>>>> return inBetween[x][y] & all; // 32K lookup
>>>> return (BitBoard) -1;
>>>>}
>>>>
>>>>Or even faster and complete branchless with two proper initialized 32KByte
>>>>arrays (contains -1 for non common squares for diagonal and straight):
>>>>
>>>>// returns all if non common diagonals
>>>>BitBoard bstyle_attack(square64 x, square64 y) {
>>>> return inBetweenButOnlyDiagonals[x][y] & all;
>>>>}
>>>>
>>>>BitBoard rstyle_attack(square64 x, square64 y) {
>>>> return inBetweenButOnlyStraight[x][y] & all; // 32K lookup
>>>>}
>>>>
>>>>Ok, a tiny loop looking on the board is not that bad and very resource friendly.
>>>>But you have up to two conditions (target square reached, occupied)per run. And
>>>>the number of runs depends on the distance (up to six runs) and whether you got
>>>>an early break. In best case with two adjacent squares it is faster.
>>>>
>>>>So you may try it, it depends on so much "chaotical things" inside a concrete
>>>>chess program and of course on the target CPU, cache size etc...
>>>>
>>>>Cheers,
>>>>Gerd
>>>
>>>
>>>I currently have a loop that checks for inbetween fields.
>>>
>>>// can_reach returns true if a piece on fromSq can reach targetSq,
>>>// and no figure is blocking the path between fromSq and targetSq
>>>bool can_reach(const position_t *ppos, int fromSq, int targetSq)
>>>{
>>> int way = targetSq - fromSq + 128;
>>> int piece = fig_table[ppos->board[fromSq]->piece+6];
>>>
>>> if ((attackmove[way] & piece) == piece) // table lookup
>>> {
>>> fromSq += attackstep[way]; // table lookup for moving vector
>>> while (fromSq != targetSq)
>>> {
>>> if (ppos->board[fromSq] != NULL)
>>> return false; // path is blocked
>>>
>>> fromSq += attackstep[way];
>>> }
>>> return true;
>>> }
>>> return false;
>>>}
>>>
>>>
>>>Currently I get along with no bitboards at all. But maintaning one allPieces
>>>bitboard in makemove would not be very costly.
>>>I use the above function for generating attacks on the fly (in SEE), for inCheck
>>>function to check if any opponent piece can reach the king, but the function
>>>could even be used in genmove, generating recaptures etc.
>>>
>>>regards
>>>Andy
>>
>>
>>Yes, i even thought about that simple loop too.
>>
>>The while loop is really cheap, and may fit in 12 bytes or so.
>>
>>loop:
>> cmp [regBoard + regX], regEmpty
>> jnz false
>>start:
>> add regX, regInc
>> cmp regX, regY
>> jne loop
>>true:
>>...
>>false:
>>
>>I guess P4's trace cache is very happy about that.
>
>I don't know about P4 or AMD, but my PPC is happy with the function.
>
>>Don't know exactly whether AMD64 prefetches the same 16 bytes all the time.
>>If the branches are predicted correctly, foreward not taken and backward taken,
>>it might be hell fast, due to board is most likely a L1-hit, only 1-4
>>cachelines, always used. How expensive is the final miss prediction?
>
>Well, actually I forgot about the inline in front of the bool... I usually even
>type out the code, a for loop or while didn't make much difference for me. But
>inlineing the code brought me a full 50% speedup for perft (which is dominated
>by the inCheck() function in my case when profiling). For the incheck it would
>look something like:
>
>// if white to move
>for (int i = 0; i < ppos->bfigs; i++) // cycle all black pieces
>{
> fromSq = ppos->blackplist[i].location;
> way = kingSq - fromSq + 128;
> piece = fig_table[ppos->board[fromSq]->piece+6];
>
> if ((attackmove[way] & piece) == piece)
> {
> int cFrom = fromSq;
> for (;;)
> {
> cFrom += attackstep[way];
> if (cFrom == kingSq)
> return true;
> if (ppos->board[cFrom] != NULL)
> break; // path is blocked
You can take these together. ie if cFrom==kingsq then (ppos->board[cFrom] !=
NULL)
So you need only 1 if and when the board is not empty then you check for
kingsquare.
Tony
> }
> }
>}
>
>
>>But yes, most probably and i feel rather confident now for certain CPUs, you and
>>Christophe Theron are right about this boolean function for non bitboard based
>>programs. And it seems not to be the hottest spot to optimize...
>>
>>Cheers,
>>Gerd
>
>In contrast to perft, the inCheck function is much less important for normal
>chess. And for SEE the most critical code is too skip SEE whenever possible. :)
>
>regards
>Andy
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.