Author: Vincent Diepeveen
Date: 16:55:08 10/29/02
Go up one level in this thread
On October 29, 2002 at 18:51:53, Gerd Isenberg wrote:
><snip>
>>>>You could go for something cheap to detect checks in qsearch
>>>>and just see whether the 'to' square possibly attacks the king
>>>>square, only after that call a function.
>>>>
>>>>if( quickchecktable[piece][63+kingsq-to]
>>>> && slowcheck(piece,to,kingsq) )
>>>> check = true; ...
>>>
>>>In that case I may miss checks that are not done by the piece that moved.
>>
>>You are concerned about xray checks?
>>
>>In DIEP i do things like that, but i don't know many engines
>>which do checks in qsearch which do them. Note that i do
>>unlimited checks in the qsearch for like 32
>>ply in a row or so (my stackdepth of qsearch currently is 32 ply
>>i did see no reason yet to make that more).
>>
>>In any case, here is how i do things in qsearch, but i have to
>>warn you that it's very slow if you want to get a million
>>nodes a second :)
>>
>>Note that you see i'm using incremental attack tables too.
>>
>>For a number of years i could do without incremental attack tables
>>by simply only using them in evaluation a lot and quite a bit less
>>in move ordering.
>>
>>>I still have ways to improve the speed without missing checks.
>>
>>In case you want to do *all* checks, then you might want to use
>>attacktables too.
>>
>>Note that i simply generate all moves in qsearch. Total move generation
>>all together is like 0.6% system time anyway.
>>
>>Then i can do all moves from which i gamble that they are making my
>>search more quiet. With big evaluation not necessarily captures and
>>checks are the only moves that can modify evaluation a lot.
>>
>>>Note that I cannot use the difference in the squares to decide if a rook pseudo
>>>attack the king because the difference between h2 and a3 is the same as the
>>>difference between a3 and b3.
>>
>>but it's very fast to use such a small table that eliminates already the
>>vaste majority.
>>
>>But indeed i also got rid of that primitive table and use a bigger table
>>now. The primitive table works great however for programs that want to
>>get a million nodes a second and extend check evasions (so we do not
>>talk about qsearch here but about normal search where of course
>>calling a function to detect whether you are in check is way way too
>>slow).
>>
>>Here is some code from DIEP hoping you will have a look at it:
>>
>> /* Eerst controleren op schaak */
>> if( piece != king && looprichting[piece][t][OpKing]
>> && ScanAttack(side,piece,t,OpKing) ) {
>> nt->zet |= move_checks;
>> }
>>
>> /* Aftrekschaak */
>> if( (AttM[f]&(ctlB|ctlR|ctlQ)) ) {
>> if( ((AttM[f]&ctlB) && looprichting[bishop][f][OpKing]
>> && ScanXrayAttack(side,bishop,f,OpKing))
>> || ((AttM[f]&ctlR) && looprichting[rook][f][OpKing]
>> && ScanXrayAttack(side,rook,f,OpKing))
>> || ((AttM[f]&ctlQ) && looprichting[queen][f][OpKing]
>> && ScanXrayAttack(side,queen,f,OpKing)) ) {
>> int bb,bc,bd,be;
>> bb = board[t];
>> bc = color[t];
>> be = snelbord[f];
>> bd = snelbord[t];
>> snelbord[f] = 0;
>> snelbord[t] = 7;
>> board[f] = 0;
>> color[f] = neutral;
>> board[t] = 7; /* zodat schaakje niet van het stuk zelf is */
>> color[t] = xside;
>> if( SqAttackedBySide(OpKing,side) ) {
>> nt->zet |= move_checks;
>> aftrekschaak = 1;
>> /* #if ForwardPruning
>> nt->zet |= move_noprune;
>> #endif*/
>> }
>> board[f] = piece;
>> color[f] = side;
>> board[t] = bb;
>> color[t] = bc;
>> snelbord[f] = be;
>> snelbord[t] = bd;
>> }
>> }
>>
>>
>>
><snip>
>
>Hi Vincent,
>nice to see some code from DIEP, interesting.
>
>I guess the declaration for looprichting is something like this:
>
>bool looprichting[12][64][64]; // 48KB or 40KB for [10]?
int looprichting[7][64][64];
of course only elements 1..5 get used practical for this check
generation so it is:
5 * 4096 * 4 = 80KB
I do not believe in mixing char/bool/int in a program is an easy
way to make a program. It needs tuning for each new generation of
processors. Of course it could speedup code 5-10% depending upon
the processor.
But for writing good readable compatible code with not a single
partial register stall, obviously making everything 32 bits is
the best way to go.
I can imagine that because of the major speed loss, mixing 64
bits code with 32 bits code in Isichess is not so dumb.
Mixing 8 bits code + 32 bits code is something which is no good show
to students. It asks for suffering and trouble. the more code there is
the more likeliness for bugs there even.
>I do it with bitboards in a similar way, but use an array of function pointers
>for each piece. For sliding pieces i use also currently a two dimensional array
>(bitboards which makes 32KB) indexed by "to" and "king" to get a bitboard
>looking for some pieces inbetween.
This array which i use above when using 8 bits indexing is 10KB.
>Due to mememory latency, i am thinking about a short direction fill, to
>determine these "inter"-bitboards. May be an array of function pointers indexed
>by let say (8+to-king) with dedicated functions that do directions fills with
>1..7 iterations without any conditional jumps.
>
>For "Aftrekschaak" i use bitboards for pinned pieces and covered
>(remove?)checkers, I initialized simultaniously for both sides before
>move-generation.
I'm not a big fan of writing code for 2 sides. It's waste of time IMHO
to do that. Only when i see no other option in my evaluation i do it.
I do it nowhere in search though. I'm 100% sure that despite that
extra 'side' i reference, my code still is fast there. Fiddling with
bitboards and attacks there is hell slow i found out when i did some
tries.
The 64 bits code at todays 32 bits processors is always suffering
from abnormal penalties. I was for example very
amazed that the huge stupid code to get a bit out of the bitboards
at the K7 was faster than doing the 2 vector instructions in a row.
You clearly showed that here and i find it kind of *disgusting*
that such things get penalties without the average programmer
even *smelling* that there could be danger there.
Optimizing for a certain processor like Dieter and Frans and many
others (Ed!) always use(d) to do is really a fulltime job IMHO.
That 2 branches at the K7 which suffer for sure 1 misprediction,
that this still is faster than 2 vector instructions in a row
really convinces me that the x86 processors still have a long
way to go.
I am therefore really happy with the R14000/McKinley processors.
No weird behaviour on these things! Writing code as i do works
great for those processors! The R14000 has a L2 cache of 8MB and
the McKinley from 3 MB (L3), so obviously using such lookup arrays
is no problem there.
It must be disaster to use 64 bits at pc processors today, because
you need to test and retest everything you do with it. Also assumption
of crafty inline assembly that compilers put everything is in certain
registers IMHO is in itself a sick assumption. Doesn't that give the
compilers less possibilities to optimize the code in your opinion?
Such functions as get called here like in the above DIEP code, you
can do very fast in integer code with a minimum of mispredictions.
However as you can see it is not optimized for speed that much. As i already
indicate, if my program would be very small and like Quest a look
like assembly engines had to run within L1 cache, the array sizes (80KB)
are already too big to consider to be useful. DIEP is using that
L2 data cache very intensively. Hopefully it gets efficiently
prefetched also at the K7 MP processor.
In the board of course the gnuchess pawn codes
are there (pawn = 1, king = 6), so for pawns you need in the function
some quick code to determine check or not. that's a few
integer instructions as we all know. No need to fiddle with slow
64 bits there.
>I have to look whether "from" is member of "remove checkers" bitboard and
>whether the move leaves the ray.
Why do you use a lookup table for functions Gerd,
ain't there faster alternatives without defining a bunch of
functions?
In my code i already lose of course in advance the difference between
white and black. Then secondly i don't need to check for the queen
seperately as that's bishop + rook. So all i need is like 1 'if'
statement extra which saves out a lookup function table.
>Regards,
>Gerd
>
>
>// two polymorph routines
>Bool CNode::IsCheckMove(UINT to, UINT piece) const {
> return (this->*m_scIsCheckMove[piece])(to);}
>
>Bool CNode::IsCheckMove(UINT from, UINT to, UINT piece) const {
> return IsCheckMove(to, piece) || IsRemoveCheckMove(from, to);}
>
>....
>
>typedef Bool (CNode::*PTR_ISCHECKMOVE)(UINT to) const;
>
>CNode::PTR_ISCHECKMOVE CNode::m_scIsCheckMove[14] =
>{
> KingCheckMove,
> KingCheckMove,
> WPawnCheckMove,
> BPawnCheckMove,
> BishopCheckMove,
> BishopCheckMove,
> KnightCheckMove,
> KnightCheckMove,
> RookCheckMove,
> RookCheckMove,
> KingCheckMove,
> KingCheckMove,
> QueenCheckMove,
> QueenCheckMove,
>};
>
>Bool CNode::KingCheckMove(UINT to) const
>{
> to;
> ASSERT(false);
> return false;
>}
>
>
>Bool CNode::WPawnCheckMove(UINT to) const
>{
> UINT eksq = EnemyKingSquare();
> if ( isDiagonalAdjacent(eksq, to) )
> {
> register int delta = eksq - to;
> return (delta == 7) + (delta == 9);
> }
> if (to >= 56 )
> return QueenCheckMove(to);
> return false;
>}
>
>...
>
>Bool CNode::BishopCheckMove(UINT to) const
>{
> UINT eksq = EnemyKingSquare();
> if ( isDiagonalAdjacent(eksq, to) )
> return true;
> if ( isDiagonal(eksq, to) )
> return (m_sInterBB[eksq][to] & GetPieceBB()) == 0;
> return false;
>}
>
>...
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.