Author: Bas Hamstra
Date: 12:08:17 01/21/03
Go up one level in this thread
On January 21, 2003 at 10:47:09, Gerd Isenberg wrote:
>On January 21, 2003 at 10:11:08, Bas Hamstra wrote:
>
>>On January 21, 2003 at 06:04:56, José Carlos wrote:
>>
>>>On January 21, 2003 at 05:46:37, Bas Hamstra wrote:
>>>
>>>>Hi Russell,
>>>>
>>>>Question: Why do you calculate KnightAttacks the way you do? Why not simply
>>>>
>>>> KnightAttacks(Sq) = KnightAttMask[Sq];
>>>>
>>>>One memory lookup in a small precalculated table KnAtt[64] should be much much
>>>>faster. For the rest I am interested to test the speed of the C-style fill
>>>>routines against rotated BB. Will be slower, but how much?
>>>>
>>>>Bas.
>>>
>>> It can be useful for any calculation you can do in parallel. For example, you
>>>might want to know which of your pieces are attacked by opponent light pieces.
>>>For simplicity, let's suppose you're only interested in knight attacks. Then
>>>you do:
>>>
>>> KinghtAttacks(MyPieces) & OpponentKnights
>>>
>>> If don't know if it is faster than extracting the bits from OpponentKnights,
>>>doing the table lookup and matching it against MyPieces, but it's simpler.
>>
>>No, the table lookup is practically FREE. Remember this is just a small table,
>>that basically remains in cache. The computation OTOH does a big number of
>>expensive shifts for nothing. Also I fail to see why it is simpler...
>>
>>Bas.
>
>Hi Bas,
>
>for one (or two) Knight it's obvious, that a lookup is superior.
>
>In Jose's sample you have to traverse (a bitscan-loop) over all MyPieces to "or"
>all the Knight-Attacks from every piece square via lookup.
>
>But in this case it is of course smarter to traverse over all opponent Knights
>(most often not more than two ;-) to or all attacks via lookup anding finally
>with own pieces.
>
>Anyway for some reasons (eg. progressive Knight mobility) it is nice to have a
>fill-based routine working with a set for multiple (virtual) knights.
>
>I think Russel's routine is faster to implement (even if it introduces few more
>dependencies), by first shifting one and two left/right with wrap anding, and
>then shifting one resp. two up and down, that saves four ands.
>With MMX you save the right wrap ands due to paddb instruction (packed add
>byte).
>
>Regards,
>Gerd
>
>--------------------------------------------------------
>
>// input: mm1 knightBB
>// output: mm0 knightAttacks
>//============================
>void getKnightAttacksMMX()
>{
> __asm
> {
> pxor mm4, mm4 ; 0x0000000000000000
> pcmpeqd mm7, mm7 ; 0xffffffffffffffff
> movq mm2, mm1 ; left
> psubb mm4, mm7 ; 0x0101010101010101
> psubb mm7, mm4 ; 0xfefefefefefefefe notA
> pand mm2, mm7 ; clear left a-file before shift
>
> paddb mm1, mm1 ; right
> psrlq mm2, 1 ; left
> movq mm3, mm1
> paddb mm1, mm1 ; 2right
> por mm3, mm2 ; left|right
> movq mm0, mm3 ; left|right
> pand mm2, mm7 ; clear left a-file before shift
> psrlq mm3, 16 ; left|right -> 2down
> psllq mm0, 16 ; left|right -> 2up
> psrlq mm2, 1 ; 2left
> por mm0, mm3
>
> movq mm3, mm1
> por mm3, mm2 ; 2left|2right
> movq mm2, mm3 ; 2left|2right
> psllq mm3, 8 ; 2left|2right -> up
> psrlq mm2, 8 ; 2left|2right -> down
> por mm0, mm3
> por mm0, mm2
> }
>}
I wonder how you call such a routine, you pass the parameter via a MMX register.
So before the routine is called you have to do some
asm lea REG, BB
asm movq mm1, [REG]
or something? Why no normal parameter passing?
Bas.
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.