Author: Sune Fischer
Date: 11:18:10 07/02/02
On July 02, 2002 at 13:45:42, Robert Hyatt wrote:

>On July 02, 2002 at 12:14:02, Sune Fischer wrote:
>
>>On July 02, 2002 at 11:41:36, Robert Hyatt wrote:
>>
>>>Actually the current version doesn't quite do that any longer. It "ors"
>>>the from/to squares (rotated) together and then XORs that result with the
>>>appropriate rotated board. Saves a couple of operations...
>>
>>Where can I find this code?
>>What I copied was from 18.15 in make.c from the MakeMove().
>
>It is a change that is in the current version. I noticed that this could
>be speeded up a bit a couple of weeks ago... IE notice how I update the
>non-rotated boards with one XOR to remove the piece from where it was and
>set the bit where it moves to. I now do this for the rotated bitmaps as
>well. Made it a few percent faster...

Wow, if you get "a few percent" just by doing that, then imagine what this
could do ;)

>>>> Same procedure goes for
>>>>the HashKey (which BTW need 4 lookups if you capture a piece or castle), so its
>>>>really 5 variables that benefit, if you add a piecesquare its 6 variable updates
>>>>that have become 2-3 times faster, not bad - but 14 megs!?
>>>>
>>>>-S.
>>>
>>>The thing I found was that the cache issue was _the_ problem, along with
>>>the small memory pipe on the PCs... That is why I went to the scheme I now
>>>use, to avoid the memory references when possible so that cache doesn't get
>>>destroyed so badly...
>>
>>Okay, so wouldn't you agree, that 1 memory reference to a large table should be
>>faster than 10 references to 4 minor tables (not even counting a piece square
>>table), unless you somehow know that the small tables are in the cache?
>
>The problem is "large". "large" tends to destroy cache lines everywhere,
>so that 4 small ones might well be faster overall... possibly by a large
>amount if they fit in L1.
>
>>It's not "just" fewer operations, it's also fewer references :)
>
>It depends on the reference space also. IE 4 references into a 16 word
>array will be far cheaper than 1 reference into a huge array.. The huge
>reference will always be a memory load. The 4 references probably will
>also be just one memory read, since a cache line is 32 or 64 bytes,
>depending on which processor you use... and once you get that small thing
>in cache it might stay. The large thing has no chance...

yes, the monster array is 1 load, always, but for sure nothing more.
Worst case you may get 10-14 loads with the current method, the zobrist table
alone is 6 kb, so you don't load all of it in one go, and you need to access it
more than once (unless you also do a delta thing, which is similar to the
moveinfo idea of course).

Anyways, seems to me 1 load is optimistic in your case, you know with the attack
tables also screaming in and out, but I really have no solid intuition about
this :)

-S.

>>-S.
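
For readers following the thread, here is a minimal C sketch of the two updates being discussed: moving a piece on a bitboard with a single XOR of the combined from/to mask, and updating a Zobrist key incrementally with two table lookups. This is not Crafty's code. Every name here (`Bitboard`, `SQ`, `zobrist`, `move_piece`, `update_hash`, `full_hash`) is a hypothetical one chosen for illustration, the rotated boards are left out, and the table layout of 12 piece kinds by 64 squares by 8 bytes is only an assumption that happens to match the 6 kb figure mentioned above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

#define SQ(s) ((Bitboard)1 << (s))
#define PIECE_KINDS 12 /* assumed: 6 piece types x 2 colours */

/* Assumed layout: 12 x 64 x 8 bytes = 6144 bytes. */
static uint64_t zobrist[PIECE_KINDS][64];

/* Small xorshift generator, used here only to fill the table. */
uint64_t xorshift64(uint64_t *state) {
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

void init_zobrist(void) {
    uint64_t state = 0x9E3779B97F4A7C15ULL; /* any nonzero seed works */
    for (int p = 0; p < PIECE_KINDS; p++)
        for (int s = 0; s < 64; s++)
            zobrist[p][s] = xorshift64(&state);
}

/* Non-capturing move: one XOR clears the from bit and sets the to bit. */
Bitboard move_piece(Bitboard board, int from, int to) {
    assert((board & SQ(from)) != 0 && (board & SQ(to)) == 0);
    return board ^ (SQ(from) | SQ(to));
}

/* Incremental key update for the same move: two table lookups. */
uint64_t update_hash(uint64_t key, int piece, int from, int to) {
    return key ^ zobrist[piece][from] ^ zobrist[piece][to];
}

/* Reference computation from scratch, only for checking the update. */
uint64_t full_hash(const Bitboard pieces[PIECE_KINDS]) {
    uint64_t key = 0;
    for (int p = 0; p < PIECE_KINDS; p++)
        for (int s = 0; s < 64; s++)
            if (pieces[p] & SQ(s))
                key ^= zobrist[p][s];
    return key;
}
```

A capture would add a lookup for the captured piece and castling two for the rook. The sketch says nothing about which scheme is faster, since that depends on what is in cache at the time, which is the open question in the thread.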
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.