Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: 0x88 and move generator speed

Author: Larry Griffiths

Date: 19:15:50 02/01/01

Go up one level in this thread


On February 01, 2001 at 21:24:13, Vincent Diepeveen wrote:

>On February 01, 2001 at 07:12:07, Dan Newman wrote:
>
>>On February 01, 2001 at 03:36:08, David Rasmussen wrote:
>>
>>>On January 31, 2001 at 07:15:59, Dan Newman wrote:
>>>
>>>>On January 30, 2001 at 06:26:17, David Rasmussen wrote:
>>>>
>>>>It just means I don't use rotated bitboards.  I have a bitboard for each
>>>>different piece type (12 bitboards) + occupied square bitboards for each
>>>>color (2) + an occupied square bitboard (1).  This means make() and undo()
>>>>are a bit cheaper since I don't have to update rotated bitboards.  Also,
>>>>there are a few large data structures (arrays of bitboards) that aren't
>>>>needed as well--so fewer memory hits.
>>>>
>>>>I treat the non-sliding pieces more or less like any other bitboard program,
>>>>but for sliding pieces I occasionally resort to scanning the board.  I guess
>>>>it might be thought of as a sort of hybrid of bitboard and mailbox.
>>>>
>>>>I suspect (but don't have any hard data) that this is cheaper than doing
>>>>rotated bitboards, and it's also much simpler to implement.
>>>>
>>>>OTOH, I think Bas (Hamstra) may have switched to rotated bitboards and
>>>>found them to be faster...
>>>>
>>>>-Dan.
>>>
>>>OK, so essentially you don't have any smart way of calculating the attackboard
>>>of sliding pieces??
>>>
>>>I mean, the normal method of extracting the state of the file, rank or >diagonal,
>>>and using this to quickly calculate an attackboard, cannot be used in a
>>>non-rotated bitboard design.
>>>
>>>Isn't this just worse in all cases compared to rotated bitboards (except for >the
>>>simpler design maybe)?
>>
>>I calculate a "pseudo-attack" bitboard for sliders.  (That's a bitboard
>>that has a bit turned on for each enemy piece on which the slider is
>>bearing, ignoring any blocking pieces.)  Then I start extracting the
>>coordinates of the "victims".  For each such coordinate I test to see if
>>any pieces are blocking the attack using a blocking-mask bitboard which
>>has all the bits turned on for squares that lie between the attacker and
>>victim.  If no blockers are there, we have a real victim.
>>
>>I don't think this differs much in cost from rotated bitboards.  In
>>some cases it's probably faster (sparse board), in others it's probably
>>slower than rotated bitboards.  Even if rotated bitboards happen to be a
>>little faster generating captures, they will certainly slow down make()
>>and undo().
>>
>>I don't think that it's really easy to figure out which is better, either by
>>reason or by experiment.  You can't just count instructions executed--there
>>are memory/cache effects to be considered.  Also, the quality of the
>>implementations will vary, and that is likely to have an even bigger effect.
>>So I might add rotated bitboards and find it slows me down--only because of
>>poor design.
>>
>>-Dan.
>
>Could you post please your LastOne() function
>in assembly or if it is in C and also write down what
>processor you use?
>
>Of course only if it's not the same as crafty, if it is
>the same i don't see how you can ignore the macro that
>eats most system time of the entire generation!

Have a look at mine, Vincent.
label is just a parm so that I can use the code several times in my program.
Count is where the last moves were added to the move list.
bb1 is a bitboard of moves or captures.
output is where a capture or non-capture movelist starts.
I also have a MoveModel that has FromPieceType, FromSquare, and Flags filled in
before using this macro.
tcmovesize is defined as 16 which is the size of each of my movelist entries.
The code basically loads the movemodel into a MMX register, then uses the "bsf"
instruction to get a square number from a capture or move bitboard.  This is
or'ed with the MMX register containing the move model and then stored in the
move list.  I have a seperate extract for Pawns that adds a value to the
extracted square to get the "from" square.  The From and To squares are then
or'ed with the MoveModel and stored into a capture/move list.

Larry.



#define defbbExtractAndOrMoveModel(label,Count,bb1,output)\
	{\
	_ESI = (unsigned long)bb1;\
	_EDX = (unsigned long)output;\
	asm	mov	ECX,[Count];\
	asm	mov	EDI,dword ptr [ESI];\
	asm	shl	ECX,4;\
	asm	mov	ESI,dword ptr [ESI+4];\
	asm	movq	mm0,qword ptr [bbTCMoveModel];\
	asm	movq	mm2,qword ptr [bbTCMoveModel+8];\
	asm	label##loop1:;\
	asm	bsf	EAX,EDI;\
	asm	jz		label##loop2;\
	asm	btr	EDI,EAX;\
	asm	movd	mm1,EAX;\
	asm	psllq	mm1,32;\
	asm	por	mm1,mm0;\
	asm	movq	[EDX+ECX],mm1;\
	asm	movq	[EDX+ECX+8],mm2;\
	asm	add	ECX,tcmovesize;\
	asm	jmp	label##loop1;\
	asm	label##loop2:;\
	asm	bsf	EAX,ESI;\
	asm	jz		label##Done;\
	asm	btr	ESI,EAX;\
	asm	add	EAX,32;\
	asm	movd	mm1,EAX;\
	asm	psllq	mm1,32;\
	asm	por	mm1,mm0;\
	asm	movq	[EDX+ECX],mm1;\
	asm	movq	[EDX+ECX+8],mm2;\
	asm	add	ECX,tcmovesize;\
	asm	jmp	label##loop2;\
	asm	label##Done:;\
	asm	shr	ECX,4;\
	asm	mov	[Count],ECX;\
	}




This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.