Author: Ian Osgood
Date: 17:45:45 09/19/02
Go up one level in this thread
On September 19, 2002 at 17:40:18, Dann Corbit wrote: >How to get the speed is the key thing, though. You might add hash tables to >TSCP and it will double the speed. You might add null move to TSCP, again, >probably doubling the speed. You might fix the iteration in eval.c to use a >move list instead of iteration over the whole board. If you go to individual >piece lists, it will be about 4x faster in the early game, and 10x faster in the >endgame. Combine them all together and it will be quite a speedup. > >Also, you should profile to find where the program bottlenecks are. Sometimes >it is not the routine that takes the most time that is the bottleneck, because a >function can act like a gate. The Intel profiler has a really nice hot spot >locator. I have been writing a chess program in Forth which was originally based on TSCP 1.73. The improvements I implemented are roughly in order of priority, since I focused first on hotspots based on line profiling. The following measures are based on a 6 ply search from the initial position. The search rate is pretty slow, since I am using a non-optimized public domain Forth. Original port, already different from TSCP by using 0x88 instead of mailbox board representation, and a function vector for piece evaluation instead of a big case statement. 208794 nodes / 77.586 seconds 2691 nps Track king positions incrementally for check detection (TSCP searches the board for the king every time) 3997 nps, 52.24 seconds (48% speedup) Material and pawn files tracked incrementally for eval (TSCP totals the material and builds pawn structure data-structures from scratch every time eval is called). +/- 1/2 pawn value aspiration window (TSCP always uses +/- mate score) 188262 nodes / 44.577 seconds (17% speedup) 4223 nps ( 6% faster ) Enhance the attack detection with table lookup using a property of the 0x88 board representation (TSCP uses an iterative method). Fix a bug which ordered illegal castle moves as captures. Inline hottest small functions. 135335 nodes / 24.796 seconds (80% speedup) 5455 nps (29% faster) Function vector and loop unrolling for move generation (TSCP uses iteration through a number of tables). 135256 nodes / 22.592 seconds (10% speedup) 5984 nps More code inlining. 6786 nps, 19.929 seconds (13% speedup) Loop unrolling for board iteration ( move gen, eval) 7234 nps, 18.697 seconds (7% speedup) Expression optimization, condition removal Optimized alpha-beta for a stack machine. Fix some search bugs. 130144 nodes / 16.113 seconds (16% speedup) 8076 nps (12% faster) Attack detection and eval optimization. 8416 nps, 15.463 seconds (4% speedup) Move gen optimization. 8769 nps, 14.841 seconds (4% speedup) Smaller history table (piece*destination vs. source*destination) 109812 nodes, 12.628 seconds (18% speedup) Killer moves (not in TSCP) 113461 nodes, 12.928 seconds (15% speedup) Smaller history+killers 103468 nodes, 12.017 seconds (24% speedup) Including killers from 2 plies back 100800 nodes, 11.847 seconds (25% speedup) More efficient move sorting using function vectors (sorts captures before killers, better in the middlegame than the opening) 106877 nodes, 12.157 seconds (22% speedup) 8784 nps Null move heuristic (not in TSCP) Better check extension (checking move vs. out-of-check in TSCP) 28032 nodes, 3.225 seconds (277% speedup!) 8665 nps Conclusions (not language specific): Good things to add to a chess program as simple as TSCP (in order): 1. The null-move heuristic. it's da bomb 2. Track king positions for check detection (if needed) 3. Optimize attack (check) detection, especially if it uses iteration. The inner loop of this is the hottest spot in TSCP. 4. Use checking-move instead of out-of-check extension (to avoid entering the quiescence search with check). 5. Use the killer move heuristic for move ordering. 6. Experiment with different history tables (I had never heard of indexing by piece*destination until recently.) 7. Use an aspiration window for your alpha-beta search. 8. Standard optimization stuff: unroll constant inner loops, inline or macros for small frequently called functions, avoid conditionals, prefer table lookups, cache-and-reuse calculations Looking back, I got some of my biggest speedups by fixing bugs. So: 9. Write correct code :) I haven't found any of the other common search extensions (single response to check, mate threat, passed pawn advance, recapture) particularly compelling yet. ( Transposition tables and an opening book are of course good things to add to any chess program, but I personally haven't gotten to that yet. I'm still trying to keep memory usage small in the hope to port this someday to the F25 asynchronous Forth multiprocessor [2 ps per instruction, 25 CPUs per die]. ) My program is at http:/www.ultratechnology.com/chess.html . In fact, it qualifies as a small chess program. There are about 1800 lines of source with lots of comments. It compiles to about 55K of threaded code and data in the dictionary (about the same size as the source file). Obfusticated, the source would probably fit in 10K. Ian
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.