Author: Alexander Kure
Date: 03:11:48 11/16/99
Go up one level in this thread
On November 16, 1999 at 03:47:16, Dan Newman wrote: [snipped] >I'm getting 8.1 million pseudo-legal moves generated per second on >Vincent's position on a P6/200 with my latest move generator--a >non-rotated bitboard move generator. This move generator does scan >the board for slider move generation, so I guess it's really a sort >of hybrid. (My previous (0x88) move generator goes at about 5.8 Mm/s.) >I suppose that I might get 18 Mm/s on a 450 MHz processor, but I >haven't tried yet. The difference in cache could make a difference, so >multiplying by 2.25 probably isn't quite right... > >Mostly what I do (to get higher performance) is to unroll loops, do >as little if-testing as I can get away with, and (attempt to) reduce >the number of memory accesses. Many of these things tend to trade code >compactness for speed, so there is a point at which the return can be >negative. > >In my 0x88 move generator I use piece lists so that I don't have to >scan the board for the STM's pieces and a switch statement to >handle move generation for different piece types rather than having >arrays of offset values and so forth indexed by piece type or direction. >This means the move generator isn't very compact, but it avoids doing >array lookups for offset values by directly embedding them in the code: > > while( *piece ) { > switch( piece_type( *piece) ) { > case PAWN: > // move gen code for pawn... > break; > ... > case BISHOP: { > int from = piece_coord( *piece); > // NE dir > int to = from + 17; > while( (to & 0x88) == 0 ) { > if( board[to] ) break; > *movep++ = from | (to << 7); > to += 17 > } > // NW dir > // etc... > break; > } > ... > case KING: > // move gen code for king > break; > } > piece++; > } > A few months ago I implemented a movegenerator for 0x88 board and tried to do some performance improvements so I am glad to see this thread starting ;-) My pseudo move-generator app. generates 3 MNodes on a Pentium MMX 233 Mhz in Vincent's position and is app. 2.5 times faster on a Pentium III 550 Mhz giving 7.5 MNodes (without sorting and assigning scores for a move). Far away from the high score, I'm afraid ;-) Unfortunately I can not show up with the code, cause I am at work right now. Maybe later if requested. But I doubt if using a switch instead of indexing through a piece array gives any performance improvement. But directly proccessing each direction (e.g. NE, NW, SE, SW for Bishop) for a piece seperately instead of looping and indexing a piece dependent direction array surly yields some performance improvement. While this lacks elegance it improves speed. I will check this out! What also gives some performance (app. 7% in my case) is to use the good old int type instead of any combination of unsigned, short, and whatsoever. >Another thing I do is have completely separate capture and non-capture >move generators. This actually hurts on Vincent's test since I have the >overhead of calling two generators and must do many things twice, but in >actual use the capture generator is called far more often than the >non-capture generator. This is because I use only the capture generator >in the quiescence search, and in the full width search a capture quite >often is the cause of a cutoff preventing the non-capture generator from >getting called... So, it likely saves some cycles to separate the move >generation like this. It also makes move ordering easier. > >-Dan. Sounds reasonable to me. Thx for your insights. Greetings Alex
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.