Author: Robert Hyatt
Date: 08:41:36 07/02/02
Go up one level in this thread
On July 02, 2002 at 09:57:42, Sune Fischer wrote: >On July 02, 2002 at 09:16:43, Robert Hyatt wrote: > >>On July 02, 2002 at 08:25:42, Sune Fischer wrote: >> >>>HI >>> >>>I've read that GNU chess used to have a precomputed list of moves, and it would >>>use a lookup in that list rather than generate the moves. >>> >>>I don't like that idea, but if instead one created a large table to contain >>>information needed when _making_ the moves, then that would probably yield a >>>good speedup. >>> >>>For instance if using rotated bitboards, then at every move one needs to update >>>the rotated bitboards which requires lookups in tables, masking and additional >>>hassle to remove a piece if the move is a capture. >>>Some of that is has got to be very cache inefficient. >>> >>>Now, with a precomputed makemove table, we could have these delta keys ready for >>>one single mask, and we need but one lookup to get all the required delta-keys. >>> >> >> >>Early Crafty versions did this, based on the description by Slate/Atkin of >>how they did the same thing. The problem is that for deep searches, you end >>up doing a lot of redundant calculations. IE if you search to a 12 ply >>nominal search depth, you can easily hit 25 plies with extensions and >>captures. That is 25 "move table" update operations, most of which over- >>write updates at previous plies. There comes a point where it is more >>efficient to do a bit of calculation where you need it, rather than trying >>to incrementally update something multiple times before it is needed... >> > >I agree, but I was not suggesting to do incremental update on something new (or >did I misread you?), only to speed up the incremental update on what you already >do. Crafty has this, which you _must_ do because you use it for move generation: > >MakePieceMove: > ClearRL90(from,OccupiedRL90); > ClearRL45(from,OccupiedRL45); > ClearRR45(from,OccupiedRR45); > SetRL90(to,OccupiedRL90); > SetRL45(to,OccupiedRL45); > SetRR45(to,OccupiedRR45); > >#define Clear(a,b) b=ClearMask(a)&(b) >#define ClearRL90(a,b) b=ClearMaskRL90(a)&(b) >#define ClearRL45(a,b) b=ClearMaskRL45(a)&(b) >#define ClearRR45(a,b) b=ClearMaskRR45(a)&(b) >#define Set(a,b) b=SetMask(a)|(b) >#define SetRL90(a,b) b=SetMaskRL90(a)|(b) >#define SetRL45(a,b) b=SetMaskRL45(a)|(b) >#define SetRR45(a,b) b=SetMaskRR45(a)|(b) > >#define SetMask(a) (set_mask[a]) >#define SetMaskRL90(a) (set_mask_rl90[a]) >#define SetMaskRL45(a) (set_mask_rl45[a]) >#define SetMaskRR45(a) (set_mask_rr45[a]) >#define ClearMask(a) (clear_mask[a]) >#define ClearMaskRL90(a) (clear_mask_rl90[a]) >#define ClearMaskRL45(a) (clear_mask_rl45[a]) >#define ClearMaskRR45(a) (clear_mask_rr45[a]) > >If I'm counting correctly here, that's 1 lookup plus 1-2 masks on both the from >and to square. >A table could do the same in 1 mask for both squares (and you have 4(!) of these >rotated boards), ok there is the pointer reference overhead, but still faster >than a cache miss on one of those table lookups I bet. Actually the current version doesn't quite do that any longer. It "ors" the from/to squares (rotated) together and then XORs that result with the appropriate rotated board. Saves a couple of operations... > Same procedure goes for >the HashKey (which BTW need 4 lookups if you capture a piece or castle), so its >really 5 variables that benefit, if you add a piecesquare its 6 variable updates >that have become 2-3 times faster, not bad - but 14 megs!? > >-S. The thing I found was that the cache issue was _the_ problem, along with the small memory pipe on the PCs... That is why I went to the scheme I now use, to avoid the memory references when possible so that cache doesn't get destroyed so badly...
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.