Author: Don Dailey
Date: 21:13:30 01/23/98
Go up one level in this thread
>I had a roughly 200 byte "state" of all the bitmaps, hash signatures, >castling/ep/50-move status and so forth. I was doing (in MakeMove) a >copy/update, so that all I had to do was back up a ply and I had no >unmake >to do. Copying 200 bytes per node is a definite *no no* on a PC. I >reduced >this over time, but the PC simply has no memory bandwidth to spare. IE >to >do a read on a P6/200 takes 20+ clock cycles, not to mention that >copying >large blocks like that blows out cache as well. And while you are >burning >those 20 cycles doing a memory read (only 8 bytes per 20 cycles >remember) >you can't fetch instructions or anything else that is not in cache. >the 6:1 was not the result of *just* copying state. There were other >changes to boot. I'd suspect that eliminating my make move approach and >doing copy/update rather than having to unmake later would cost at >*least* >30%. My mistake was running on a non-PC to start with... like the Cray. >It can copy like nobody's business. But in a PC, just converting your >chess board from chars to ints will make a very noticable slowdown, >because you start pumping 4x the data, using 4x the cache. I found that >on the PC, everything possible should be char to fit in the small cache. >And that it is critical to arrange data so that data needed at about the >same time is adjacent in memory so that a cache line fill will fetch >stuff >needed, not stuff that will not be used... Hi Bob, 30% sounds better. I think I can explain the difference. I am using about 160 bytes of state, but a lot of this the old make/unmake program also used. I did a bunch of stuff that I guess modern programs are no longer doing so the difference between my "position state" vs the "make/unmake" program are small. For instance here are some of the things my old program did: I would save several hash keys in a stack to avoid recalculating them when I unmade a move. I saved the move made itself so I would know how to unmake it. I saved the piece captured, and information to tell me how to unmake special moves if necessary. This is the kind of stuff in my current search state too. The depth of search I passed recursively from search call to call instead of updating a global variable. Instead of recomputing the tactical score incrementally for pawns and pieces I popped them off a stack. My variable that tells me if I'm in check was not a global variable, I did not want to recompute this when I unmade a move. At the time I had erroneously believed it was better to just pop this stuff off a stack when unmaking. I just assumed 1 memory accesses was better than 2 or 3 to incrementally restore a hash key for instance but I wasn't taking cache effects into consideration. So my new state passing program is actually not much different than the old program and maybe that's why its so cheap for my program. I should experiment with different setups to see how this affects things. Most of my state copies will probably be in the same locations a high percentage of the time if I have an efficient write cache it might no be so horrible. - Don
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.