Author: Alexander Kure
Date: 03:11:48 11/16/99
Go up one level in this thread
On November 16, 1999 at 03:47:16, Dan Newman wrote:
[snipped]
>I'm getting 8.1 million pseudo-legal moves generated per second on
>Vincent's position on a P6/200 with my latest move generator--a
>non-rotated bitboard move generator. This move generator does scan
>the board for slider move generation, so I guess it's really a sort
>of hybrid. (My previous (0x88) move generator goes at about 5.8 Mm/s.)
>I suppose that I might get 18 Mm/s on a 450 MHz processor, but I
>haven't tried yet. The difference in cache could make a difference, so
>multiplying by 2.25 probably isn't quite right...
>
>Mostly what I do (to get higher performance) is to unroll loops, do
>as little if-testing as I can get away with, and (attempt to) reduce
>the number of memory accesses. Many of these things tend to trade code
>compactness for speed, so there is a point at which the return can be
>negative.
>
>In my 0x88 move generator I use piece lists so that I don't have to
>scan the board for the STM's pieces and a switch statement to
>handle move generation for different piece types rather than having
>arrays of offset values and so forth indexed by piece type or direction.
>This means the move generator isn't very compact, but it avoids doing
>array lookups for offset values by directly embedding them in the code:
>
> while( *piece ) {
> switch( piece_type( *piece) ) {
> case PAWN:
> // move gen code for pawn...
> break;
> ...
> case BISHOP: {
> int from = piece_coord( *piece);
> // NE dir
> int to = from + 17;
> while( (to & 0x88) == 0 ) {
> if( board[to] ) break;
> *movep++ = from | (to << 7);
> to += 17
> }
> // NW dir
> // etc...
> break;
> }
> ...
> case KING:
> // move gen code for king
> break;
> }
> piece++;
> }
>
A few months ago I implemented a movegenerator for 0x88 board and tried to do
some performance improvements so I am glad to see this thread starting ;-)
My pseudo move-generator app. generates 3 MNodes on a Pentium MMX 233 Mhz in
Vincent's position and is app. 2.5 times faster on a Pentium III 550 Mhz giving
7.5 MNodes (without sorting and assigning scores for a move). Far away from the
high score, I'm afraid ;-)
Unfortunately I can not show up with the code, cause I am at work right now.
Maybe later if requested.
But I doubt if using a switch instead of indexing through a piece array gives
any performance improvement. But directly proccessing each direction (e.g. NE,
NW, SE, SW for Bishop) for a piece seperately instead of looping and indexing a
piece dependent direction array surly yields some performance improvement. While
this lacks elegance it improves speed.
I will check this out!
What also gives some performance (app. 7% in my case) is to use the good old int
type instead of any combination of unsigned, short, and whatsoever.
>Another thing I do is have completely separate capture and non-capture
>move generators. This actually hurts on Vincent's test since I have the
>overhead of calling two generators and must do many things twice, but in
>actual use the capture generator is called far more often than the
>non-capture generator. This is because I use only the capture generator
>in the quiescence search, and in the full width search a capture quite
>often is the cause of a cutoff preventing the non-capture generator from
>getting called... So, it likely saves some cycles to separate the move
>generation like this. It also makes move ordering easier.
>
>-Dan.
Sounds reasonable to me.
Thx for your insights.
Greetings
Alex
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.