Author: Stuart Cracraft
Date: 17:55:45 12/28/05
Deep Blue
Artificial Intelligence 134 (2002) 57
Murray Campbell, A. Joseph Hoane Jr., Feng-hsiung Hsu

Page 2 - Outline
Introduction, System overview, The chess chip, Software search, Hardware search, Parallel search, Evaluation function, Miscellaneous, Conclusion, References

Page 3 - Abstract
Deep Blue is the chess machine that defeated then-reigning World Chess Champion Garry Kasparov in a six-game match in 1997. Factors that contributed to this success:
- a single-chip chess search engine
- a massively parallel system with multiple levels of parallelism
- a strong emphasis on search extensions
- a complex evaluation function
- effective use of a Grandmaster game database

Page 4 - 1. Introduction
The Deep Blue computer chess system was developed at IBM Research during the mid-1990s. There are two distinct versions of Deep Blue: one lost to Garry Kasparov in 1996, and the other defeated him in 1997.
- 1996 version: Deep Blue I
- 1997 version: Deep Blue II
These notes refer primarily to the 1997 version, with comparisons to the 1996 version.

Page 5 - Introduction (cont.)
1.1. ChipTest and Deep Thought
These systems used a single-chip chess move generator [12] to achieve search speeds in the neighborhood of 500,000 positions per second (ChipTest) to 700,000 positions per second (Deep Thought). Deep Thought is described in detail in [16,17].

Page 6 - Introduction (cont.)
1.2. Deep Thought 2
Also known as the Deep Blue prototype. It used the same move-generator chip as Deep Thought, but had four improvements:
- Medium-scale multiprocessing: 24 chess engines.
- Enhanced evaluation hardware: the evaluation hardware used larger RAMs and was able to include a few additional features in the evaluation function.
- Improved search software: the search software was designed to deal better with the parallel search, and implemented a number of new search extension ideas.
- Extended book [6]: the extended book allowed Deep Thought 2 to make reasonable opening moves even in the absence of an opening book.

Page 7 - Introduction (cont.)
1.3. Deep Blue I
- Deep Blue I was based on a single-chip chess search engine.
- It ran on a 36-node IBM RS/6000 SP computer and used 216 chess chips.
- Each chip searched about 1.6-2 million chess positions per second.
- Overall search speed was 50-100 million chess positions per second.

Page 8 - Introduction (cont.)
1.4. Deep Blue II
First change: a new chess chip was designed.
- The evaluation function was redesigned, growing from around 6,400 features to over 8,000.
- Hardware repetition detection and a number of specialized move generation modes were added.
- Per-chip search speed increased to 2-2.5 million positions per second.
Second change:
- More than double the number of chess chips in the system.
- Use the newer generation of SP computer to support the higher processing demands.

Page 9 - Introduction (cont.)
Third change: development of a set of software tools to aid in debugging and match preparation, e.g., evaluation tuning and visualization tools.
Finally: having concluded that the searching ability of Deep Blue was acceptable, the team spent the vast majority of the time between the two matches designing, testing, and tuning the new evaluation function.

Page 10 - 2. System overview
Deep Blue is a massively parallel system designed for carrying out chess game tree searches. It is composed of a 30-node (30-processor) IBM RS/6000 SP computer and 480 single-chip chess search engines, with 16 chess chips per SP processor.
- 28 nodes have 120 MHz P2SC processors, and 2 nodes have 135 MHz P2SC processors.
- All nodes have 1 GB of RAM and 4 GB of disk.
- The chess chips are each capable of searching 2 to 2.5 million chess positions per second, and communicate with their host node via a Micro Channel bus.
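Note (illustration, not from the paper): as a quick back-of-the-envelope check using only the figures on this slide, the nominal peak throughput of the chip array is roughly a billion positions per second; the sustained system speeds quoted on the next slide are well below that, a gap the parallel-search discussion (Section 6) returns to. The arithmetic below is mine, not a calculation the authors report.

    # Rough nominal-throughput estimate from the slide's own figures.
    nodes = 30                      # SP processors
    chips_per_node = 16             # chess chips per SP processor
    chip_speed = (2.0e6, 2.5e6)     # positions per second per chip (low, high)

    total_chips = nodes * chips_per_node
    nominal_peak = tuple(s * total_chips for s in chip_speed)

    print(total_chips)              # 480
    print(nominal_peak)             # (960000000.0, 1200000000.0): roughly 1e9 positions/s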
Page 11 - System overview (cont.)
Deep Blue is organized in three layers. One of the SP processors is designated as the master, and the remainder as workers.
- Master: searches the top levels of the chess game tree, then distributes "leaf" positions to the workers for further examination.
- Workers: carry out a few levels of additional search, then distribute their leaf positions to the chess chips.
- Chess chips: search the last few levels of the tree.

Page 12 - System overview (cont.)
Overall system speed varied widely.
1996 version:
- Averaged about 100 million positions per second.
- For quieter positions, speeds close to 200 million positions per second were typical.
1997 version:
- The average system speed observed in searches longer than one minute was 126 million positions per second.
- The maximum sustained speed observed in that match was 330 million positions per second.

Page 13 - System overview (cont.)
Challenges:
- Large searching capacity (Sections 4 and 5): the search should be highly non-uniform, and should provide "insurance" against simple errors.
- Hardware evaluation (Section 7): the evaluation function is implemented in hardware, so the time to execute it is a fixed constant.
- Massively parallel search (Section 6): over 500 processors are available to participate in the game tree search.

Page 14 - System overview (cont.)
Hybrid software/hardware search: the Deep Blue search combines a software search with a hardware search.
- The software search is extremely flexible and can be changed as needed.
- The hardware search is parameterized, but the general form of the search is fixed.

Page 15 - 3. The chess chip
The chess chip divides into three parts: the move generator, the evaluation function, and the search control.
3.1. Move generation
- The move generator is implemented as an 8 × 8 array of combinatorial logic, which is effectively a silicon chessboard.
- A hardwired finite state machine controls move generation.
- Although the move generator produces only one move at a time, it implicitly computes all possible moves and selects one via an arbitration network. Computing all the moves simultaneously is one way to get minimum latency while generating moves in a reasonable order.

Page 16 - The chess chip (cont.)
3.2. Evaluation function
The evaluation function is composed of a "fast evaluation" and a "slow evaluation" [7]. This is a standard technique for skipping an expensive full evaluation when an approximation is good enough.
- Fast evaluation: computes a score for a chess position in a single clock cycle; it contains all the easily computed major evaluation terms with high values.
- Slow evaluation: scans the chessboard one column at a time, computing values for chess concepts such as king safety, pawn structure, and passed pawns.
- The features recognized in both the slow and fast evaluation functions have programmable weights, allowing their relative importance to be easily adjusted.
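Note (illustration, not from the paper): the fast/slow split is the familiar lazy-evaluation idea: pay for the expensive terms only when the cheap estimate cannot already settle the position. The slide does not give Deep Blue's exact criterion for "good enough", so the sketch below uses a common software formulation with a hypothetical margin and placeholder scoring functions; it illustrates the technique, not the chip's logic.

    def fast_eval(position):
        """Placeholder for the single-cycle fast evaluation (cheap, high-value terms)."""
        return position.get("material", 0)

    def slow_eval(position):
        """Placeholder for the column-by-column slow evaluation
        (king safety, pawn structure, passed pawns, ...)."""
        return position.get("positional", 0)

    LAZY_MARGIN = 150   # hypothetical bound on how much the slow terms can matter

    def evaluate(position, alpha, beta):
        """Lazy evaluation: if the fast estimate lies far enough outside the
        (alpha, beta) window that the slow terms cannot change the search
        outcome, return it as-is; otherwise add the expensive terms."""
        fast = fast_eval(position)
        if fast + LAZY_MARGIN <= alpha or fast - LAZY_MARGIN >= beta:
            return fast
        return fast + slow_eval(position)

    # Example: the fast score alone is decisive here, so slow_eval is skipped.
    print(evaluate({"material": 300, "positional": -40}, alpha=-50, beta=50))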
Page 17 - The chess chip (cont.)
3.3. Search control
The search control portion of the chip uses a number of state machines to implement null-window alpha-beta search.
- Advantages of null-window search: it eliminates the need for a value stack and simplifies the hardware design.
- Disadvantage of null-window search: in some cases it is necessary to do multiple searches.
- Limitation of the hardware search: the lack of a transposition table; the workaround is to search in software (move stack).

Page 18 - The chess chip (cont.)
3.4. Extendability
An external FPGA can be used to provide:
- access to an external transposition table,
- more complicated search control, and
- additional terms for the evaluation function.
Null move search was also explicitly supported by this method.

Page 19 - 4. Software search
Based on the experience with Deep Thought 1, a new selective search was built for Deep Thought 2, called "dual credit with delayed extensions".
Design principles:
- Extend forcing/forced pairs (ffp's) of moves.
- Forced moves are expectation dependent.
- Fractional extensions.
- Delayed extensions.
- Dual credit.
- Preserve the search envelope.

Page 20 - Software search (cont.)
4.1. Credit generation mechanisms
ffp's accumulate "credit", and only when sufficient credit is available can it be "cashed in" for an extension. There is a large set of mechanisms to identify nodes that should receive credit:
- Singular, binary, trinary, etc. Absolute singular: when there is only one legal move, a large credit can be given with very little risk.
- Threat, mate threat: it is relatively simple, using a null move search, to detect whether there is a threat in the current position. If a large threat exists, such as a threat to checkmate, a higher level of credit can be given.
- Influence: this mechanism gives credit for moves which are enabled by previous moves.
- Domain dependent.

Page 21 - Software search (cont.)
4.2. Sample behavior
- For a given iteration i, the software is assigned i - 4 ply, which represents the minimum depth searched in software.
- The maximum software depth is approximately three times the minimum depth.
- The last column of the sample table (not reproduced here) estimates the maximum depth reached in hardware and software combined; the estimate is based on results of simulating the hardware search.

Page 22 - Software search (cont.)
4.3. Miscellaneous
- Deep Blue scores are composed of two 16-bit signed integers: a regular search score and a tie-breaker score.
- For a draw, the regular search score is zero and the tie breaker contains either the static evaluation of a theoretically drawn position or the count of moves until repetition.
- "No progress" (a pruning mechanism) is implemented by detecting whether the current position could have been reached by playing an alternative move at some earlier position on the search path.

Page 23 - 5. Hardware search
- The hardware search takes place on the chess chip. A chess chip carries out a fixed-depth null-window search, which includes a quiescence search.
- Fast and simple: the chess chips are limited to carrying out only shallow searches, and the host processor controlling a chip is free to do other work while it runs.
- The host can carry out a binary search, initiating a series of null-window searches to determine the value (in some cases an exact value within a range is needed).
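Note (illustration, not from the paper): a null-window (zero-width) probe only answers "is the value at least gamma?", so the host's trick of pinning down an exact value by bisection can be shown with a small software sketch. The toy tree, bounds, and fail-soft negamax below are stand-ins, not the chip's state-machine search.

    def probe(node, gamma):
        """Null-window probe (fail-soft negamax over a toy tree of nested lists):
        answers whether the value of `node` is >= gamma, returning only a bound."""
        if not isinstance(node, list):          # leaf: static score, side to move's view
            return node
        best = -10_000
        for child in node:
            best = max(best, -probe(child, 1 - gamma))
            if best >= gamma:                   # fail high: cut off remaining moves
                break
        return best

    def exact_value(root, lo=-1000, hi=1000):
        """Host-side bisection from the slide: determine an exact value within
        a known range by issuing a series of null-window probes."""
        while lo < hi:
            gamma = (lo + hi + 1) // 2
            if probe(root, gamma) >= gamma:
                lo = gamma                      # value is at least gamma
            else:
                hi = gamma - 1                  # value is below gamma
        return lo

    # Toy tree: leaves are integer scores; the true negamax value here is 1.
    tree = [[3, -2], [5, 1], [-4, 7]]
    print(exact_value(tree))                    # -> 1

Roughly log2 of the value range in probes suffices, and each probe is far cheaper than a full-window search.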
Page 24 - Hardware search (cont.)
The main parameters of the hardware search:
- depth of search,
- depth of offset searches,
- endgame rules assertions off or on,
- endgame ROM assertions off or on,
- number of "mating" checks allowed for each side in the quiescence search,
- number of singular checking moves allowed in the quiescence search,
- flags to enable testing of singular, binary, or trinary conditions at the search root,
- a flag to ignore stalemate at one ply above quiescence,
- a flag to allow a one-ply extension in the quiescence search after a pawn moves to the 7th rank or, in some cases, the 6th rank,
- a flag to allow a one-ply extension in the quiescence search when the side to move has multiple hung pieces or a piece that is pinned and hung,
- a flag to allow a one-ply extension in the quiescence search when the side to move has one hung piece for consecutive plies, and
- a flag to allow a one-ply extension in the quiescence search if the opponent has any hung pieces.

Page 25 - 6. Parallel search
- Deep Blue is composed of a 30-node RS/6000 SP computer and 480 chess chips.
- The SP nodes communicate with each other using the MPI standard [10].
- The chess chips communicate with their host node via a Micro Channel bus.
- This heterogeneous architecture has a strong influence on the parallel search algorithm used in Deep Blue.

Page 26 - Parallel search (cont.)
6.1. Parallel search algorithm
- Processor hierarchy: a static processor tree, with one SP node controlling the other 29 nodes, which in turn control 16 chess chips each. The chess chips can only communicate directly with their host node.
- Control distribution: centralized control of the parallel search; the control is managed on the SP nodes.

Page 27 - Parallel search (cont.)
Parallelism possible:
- Type 1 (PV) nodes: after the first move has been examined at a PV node, all the alternatives may be examined in parallel.
- Good type 2 nodes: nodes where the first move "fails high", i.e., exceeds expectations.
- Bad type 2 nodes: nodes where the fail-high move is not searched first.
- Type 3 nodes: nodes where all the moves fail low.
Synchronization: type 1 and type 2 nodes are synchronization points for Deep Blue; the first move must be evaluated before parallelism is allowed.

Page 28 - Parallel search (cont.)
6.2. Parallel search implementation
- Load balancing: the search extensions algorithm leads to widely varying tree sizes for a given search depth. This extends all the way to the hardware and leads to severe load-balancing problems. Solution: abort long-running hardware searches and push more of the search into software.
- Master overload: ensure that each worker always has a job "on deck", ready to execute when it completes its active job.
- Sharing between nodes: workers generally pass their "expensive" transposition table results up to the master.

Page 29 - Parallel search (cont.)
6.3. Parallel search performance
It is difficult to assess the efficiency of the full 30-node Deep Blue system. Indirect evidence suggests an overall observed efficiency of about 8% in tactical positions and about 12% in quiet positions.
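Note (illustration, not from the paper): a toy of the layered work distribution described above, with a Python process pool standing in for the MPI workers and chess chips. The position encoding, scoring, and scheduling here are invented for the example and ignore the real system's load balancing and transposition-table sharing.

    from concurrent.futures import ProcessPoolExecutor

    def worker_search(leaf):
        """Stand-in for a worker node: a few plies of software search, then the
        chess chips finish the last few levels. Here the 'position' is just an
        integer and the 'score' placeholder arithmetic."""
        return leaf, -(leaf % 7)

    def master(leaf_positions, n_workers=4):
        """Stand-in for the master node: it has already searched the top of the
        tree (not shown) and now farms the resulting leaf positions out to keep
        the workers busy, then backs the best score up."""
        with ProcessPoolExecutor(max_workers=n_workers) as pool:
            scores = dict(pool.map(worker_search, leaf_positions, chunksize=4))
        return max(scores, key=scores.get)      # leaf with the best backed-up score

    if __name__ == "__main__":
        leaves = list(range(32))                # pretend these came from the master's search
        print(master(leaves))

The real system additionally keeps a second job "on deck" per worker and aborts overly long hardware searches, which a plain process pool does not capture.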
Page 30 - 7. Evaluation function
7.1. Overview
- The evaluation function is essentially a sum of feature values.
- Static values are set once at the beginning of a search.
- Dynamic values are also initialized at the beginning of a search, but are scaled during the search.
- The initialization is done by the "evaluation function generator", a sub-program run on the master node.
- The evaluation function generator runs only at the root of the search tree, and adjusts feature values based on the context of the root position.

Page 31 - Evaluation function (cont.)
There are 54 registers and 8,096 table entries, for a total of 8,150 parameters that can be set in the Deep Blue evaluation function.

Page 32 - (nothing)

Page 33 - Evaluation function (cont.)
7.2. Automated evaluation function analysis
- The first tool had the goal of identifying features in the Deep Blue I evaluation function that were "noisy", i.e., relatively insensitive to the particular weights chosen.
- The second tool was developed with the goal of tuning evaluation function weights.

Page 34 - 8. Miscellaneous
8.1. Opening book
- The opening book was created by hand.
- The book consisted of about 4,000 positions, and every position had been checked by Deep Blue in overnight runs.
- The openings were chosen to emphasize positions that Deep Blue played well.

Page 35 - Miscellaneous (cont.)
8.2. Extended book
The extended book [6] is a mechanism that allows a large Grandmaster game database to influence and direct Deep Blue's play in the absence of opening book information. It summarizes the information available at each position and uses the summary information to nudge Deep Blue in the consensus direction of chess opening theory.

Page 36 - Miscellaneous (cont.)
Factors that go into the extended book evaluation function:
- the number of times a move has been played,
- the relative number of times a move has been played,
- the strength of the players that play the move,
- the recentness of the move,
- the results of the move,
- commentary on the move, and
- game moves versus commentary moves (game moves are considered more reliable than commentary moves, and are thus likely to be better).
An ad hoc function was developed that combined these factors in a nonlinear way to produce a scalar value as output.

Page 37 - Miscellaneous (cont.)
8.3. Endgame databases
- The endgame databases include all chess positions with five or fewer pieces on the board, and selected six-piece positions that include a pair of blocked pawns.
- The databases were used both off-line and on-line.
- The off-line usage was during the design of the chess chips: each chip has a ROM which stores patterns to help evaluate certain frequently occurring chess endgames.
- The software search used the databases in on-line mode: each of the 30 general-purpose processors replicated the 4-piece and important 5-piece databases on its local disk.
- Endgames were stored in the databases with one bit per position, indicating lose or does-not-lose.

Page 38 - Miscellaneous (cont.)
8.4. Time control
Chess games typically have a set of requirements on the speed of play, termed the "time control". Before each search, two time targets are set:
- normal time target: approximately the time remaining to the next time control divided by the moves remaining,
- panic time target: roughly one third of the remaining time.

Page 39 - Miscellaneous (cont.)
At the normal time target, a time-out occurs and the current best move is played. Conditions under which Deep Blue goes beyond its normal target into "panic time":
- the current best move has dropped 15 points or more from the score of the previous iteration,
- the best move from the previous iteration is in a potential "fail-low" state, or
- a new move is in a potential "fail-high" state.
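Note (illustration, not from the paper): the two time targets and the conditions for exceeding the normal one reduce to simple arithmetic and flags. The sketch below uses only the figures quoted on these slides; the function names and example numbers are mine.

    def time_targets(seconds_remaining, moves_to_time_control):
        """Normal target: remaining time spread over the remaining moves.
        Panic target: roughly one third of all remaining time."""
        normal = seconds_remaining / max(1, moves_to_time_control)
        panic = seconds_remaining / 3.0
        return normal, panic

    def keep_searching_past_normal(score_drop, best_may_fail_low, new_move_may_fail_high):
        """Per the slides, Deep Blue goes beyond the normal target into panic time
        if the best move dropped 15 or more points from the previous iteration,
        the previous best move may be failing low, or a new move may be failing high."""
        return score_drop >= 15 or best_may_fail_low or new_move_may_fail_high

    # Example: 45 minutes left and 15 moves to the next time control.
    print(time_targets(45 * 60, 15))                      # (180.0, 900.0) seconds
    print(keep_searching_past_normal(20, False, False))   # True: large score drop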
Page 40 - 9. Conclusion
There were many areas where improvements could have been made:
- parallel search efficiency,
- the hardware search and evaluation could have been made more efficient and flexible,
- evaluation function tuning, both automatic and manual, was far from complete, and
- the addition of pruning mechanisms to Deep Blue might have significantly improved the search.

Page 41 - Summary of Deep Blue and predecessors (not included)