Computer Chess Club Archives


Subject: Re: R. Hyatt, Crafty style move generation

Author: Landon Rabern

Date: 23:39:12 01/24/01


On January 25, 2001 at 01:33:23, Larry Griffiths wrote:

>On January 24, 2001 at 23:08:47, Landon Rabern wrote:
>
>>On January 24, 2001 at 22:49:08, Landon Rabern wrote:
>>
>>>On January 24, 2001 at 03:43:03, Severi Salminen wrote:
>>>
>>>>>How much of a speed increase did you get from using MMX?  I got only about a 35%
>>>>>increase.
>>>>
>>>>"Only"??? Are you mad :) A 35% speedup from using a different set of instructions
>>>>is _a lot_!! Are you fast now, or were you slow back then?
>>>>
>>>>Severi
>>>
>>>I said only because I was under the impression that Larry was getting a lot
>>>faster, but I guess not.
>>>
>>>I was always pretty fast and now faster.  Around 400,000 nps on a PIII-500 with
>>>my whole search and eval and everything on.  My eval is not that complicated
>>>right now and could use some work and my move ordering can be improved some.  I
>>>am at school now and do not have much time except to work on a Neural Network
>>>for my eval which I convinced them to give me credit for.
>>>
>>>The released version of my program (5.1, I think) does not have the MMX
>>>stuff because I did not want to deal with making it compatible with CPUs
>>>both above and below the PII.
>>>
>>>Regards,
>>>
>>>Landon
>>
>>If anyone cares, here is the MMX code I wrote for it.  I would not be surprised
>>if there were some good ways to optimize it.
>>
>>Regards,
>>
>>Landon
>>
>>#define gBlackBishopQueenCaps(fs)    \
>>{\
>>	__asm mov eax,board\
>>	__asm movq mm0,[eax].A1H8board\
>>	__asm mov ebx,fs\
>>	__asm shl ebx,2\
>>	__asm movd mm1,[A1H8shift+ebx]\
>>	__asm psrlq mm0,mm1\
>>	__asm pand mm0,[A1H8rotMask+ebx]\
>>	__asm movd mm2,[A8H1shift+ebx]\
>>	__asm movq mm3,[eax].A8H1board\
>>	__asm psrlq mm3,mm2\
>>	__asm pand mm3,[A8H1rotMask+ebx]\
>>	__asm shl ebx,9\
>>	__asm movd ecx,mm0\
>>	__asm shl ecx,3\
>>	__asm movq mm4,[attacksA1H8+ebx+ecx]\
>>	__asm movd ecx,mm3\
>>	__asm shl ecx,3\
>>	__asm movq mm5,[attacksA8H1+ebx+ecx]\
>>	__asm por mm4,mm5\
>>	__asm pand mm4,whitePieces\
>>	__asm movq toMap,mm4\
>>}
>
>Looks similar to my code.  I have GenCaps, GenMoves and GenCapsMoves.
>The GenCapsMoves produces both a capture and a move bitboard.  I save the move
>bitboard in my derived piece object so that I can just grab the move bitboard
>when GenMoves is called after a GenCapture.
>
>Here is my Queen GenCapsMoves...
>
>#define	defbbGenSplitQueenCapturesMoves(fsq,bbrankocpieces)\
>	{\
>	_ESI = (unsigned long)fsq;\
>	asm	lea	EDI,bbArray1;\
>	asm	movzx	EAX,bbRankContentsOffset[ESI];\
>	asm	movzx	EBX,bbFileContentsOffset[ESI];\
>	asm	movzx	ECX,bbLDiagContentsOffset[ESI];\
>	asm	movzx	EDX,bbRDiagContentsOffset[ESI];\
>	asm	movzx	EAX,byte ptr [EDI+EAX+bbRankOccupied*8];\
>	asm	movzx	EBX,byte ptr [EDI+EBX+bbFileOccupied*8];\
>	asm	movzx	ECX,byte ptr [EDI+ECX+bbLDiagOccupied*8];\
>	asm	movzx	EDX,byte ptr [EDI+EDX+bbRDiagOccupied*8];\
>	asm	shl	ESI,0x0b;\
>	asm	movq	mm4,[EDI+bbRankOccupied*8];\
>	asm	movq	mm0,[bbRankAdjacentEmptySquaresPieces+ESI+EAX*8];\
>	asm	por	mm0,[bbFileAdjacentEmptySquaresPieces+ESI+EBX*8];\
>	asm	por	mm0,[bbLDiagAdjacentEmptySquaresPieces+ESI+ECX*8];\
>	asm	por	mm0,[bbRDiagAdjacentEmptySquaresPieces+ESI+EDX*8];\
>	asm	pandn	mm4,mm0;\
>	asm	pand	mm0,[EDI+bbrankocpieces*8];\
>	asm	movq	bbMoves,mm4;\
>	asm	movq	bbCaptures,mm0;\
>	}
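
The last four instructions of that macro split one attack set into quiet moves (pandn against the occupied bitboard) and captures (pand against the bitboard passed in, presumably the opposing side's pieces). A minimal sketch of the same split in plain C, assuming 64-bit bitboards. The names SplitMoves, split_attacks, occupied and enemy are made up for illustration and do not come from either program:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

/* Hypothetical result type: one attack set split into two bitboards. */
typedef struct {
    Bitboard captures; /* attacked squares holding an enemy piece */
    Bitboard moves;    /* attacked squares that are empty */
} SplitMoves;

/* attacks:  every square the piece reaches, including the first blocker
 *           in each direction
 * occupied: all pieces of both colours
 * enemy:    the opposing side's pieces (must be a subset of occupied) */
static SplitMoves split_attacks(Bitboard attacks, Bitboard occupied,
                                Bitboard enemy)
{
    SplitMoves r;
    assert((enemy & ~occupied) == 0);
    r.moves = attacks & ~occupied; /* same effect as pandn mm4,mm0 */
    r.captures = attacks & enemy;  /* same effect as pand mm0,[...] */
    return r;
}
```

The two results never overlap, and attacked squares holding a friendly piece fall out of both.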

What if you get a cutoff and do not need the moves?  I gen only caps first, and
gen moves only after I have tried all the caps, so that if I get a cutoff, the
work of generating moves is saved.
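
To make the ordering concrete, here is a rough sketch in plain C, not my actual search. It is a toy: each move is represented only by the score a real search would return for it, and ToyNode, staged_search and quiet_gen_count are names invented for the example. The counter stands in for the cost of the non-capture generator:

```c
#include <assert.h>
#include <stddef.h>

/* Toy node: each move is just the score a real search would return. */
typedef struct {
    const int *cap_scores;   /* scores of the capture moves */
    size_t n_caps;
    const int *quiet_scores; /* scores of the non-capture moves */
    size_t n_quiets;
    int quiet_gen_count;     /* times the non-capture generator ran */
} ToyNode;

/* Fail-hard alpha-beta over one node, captures first. */
static int staged_search(ToyNode *node, int alpha, int beta)
{
    size_t i;
    assert(alpha < beta);

    /* Stage 1: generate and try captures only. */
    for (i = 0; i < node->n_caps; i++) {
        int score = node->cap_scores[i];
        if (score >= beta)
            return beta; /* cutoff: non-captures never generated */
        if (score > alpha)
            alpha = score;
    }

    /* Stage 2: no cutoff so far, so pay for the non-captures now. */
    node->quiet_gen_count++;
    for (i = 0; i < node->n_quiets; i++) {
        int score = node->quiet_scores[i];
        if (score >= beta)
            return beta;
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}
```

When a capture fails high, the function returns with quiet_gen_count untouched, which is the saving I mean. A combined generator would have built both bitboards before the first capture was tried.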

I also use the MMX registers for check detection which seems to speed that up a
little.

Have you tried coding the whole generation routine in assembler, like the loop
to take moves out of the bitboard, etc.?  If so, any improvement?
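
By the loop I mean something like the following, written here as a portable C sketch rather than my real code. The names pop_lsb and serialize_moves and the 16-bit from/to move encoding are assumptions for the example. The bit scan is a plain loop, where an assembler version would more likely use a single bit-scan instruction such as bsf:

```c
#include <assert.h>
#include <stdint.h>

/* Return the index (0..63) of the lowest set bit and clear it. */
static int pop_lsb(uint64_t *bb)
{
    uint64_t b = *bb;
    int sq = 0;
    assert(b != 0);
    while (((b >> sq) & 1u) == 0) /* portable scan for the lowest bit */
        sq++;
    *bb = b & (b - 1);            /* clear the bit just found */
    return sq;
}

/* Turn a bitboard of target squares into a list of packed moves.
 * Hypothetical encoding: bits 6..11 = from square, bits 0..5 = to square.
 * Returns the number of moves written to out. */
static int serialize_moves(int from, uint64_t to_map, uint16_t *out)
{
    int n = 0;
    while (to_map != 0) {
        int to = pop_lsb(&to_map);
        out[n++] = (uint16_t)((from << 6) | to);
    }
    return n;
}
```

The loop runs once per set bit, so its cost follows the number of moves, not the 64 squares.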


Regards,

Landon




Last modified: Thu, 15 Apr 21 08:11:13 -0700
