Author: John Stoneham
Date: 12:31:12 11/03/98
Go up one level in this thread
On November 03, 1998 at 13:05:57, Robert Hyatt wrote: >On November 03, 1998 at 09:44:36, John Stoneham wrote: > >>On November 02, 1998 at 21:45:05, Robert Hyatt wrote: >> >>>a couple of tricks... >>> >>>1. you can use square&56 as the shift amount for the 90 degree rotated bit map, >>>since that will get you exactly the right shift count for any rank... >>> >>>2. you may use a bishop shift value, but you can get away by always using 8 >>>bits from the rotated diagonal bitmaps, if you simply set up the array so that >>>the "unused bits" still produce the right answer. This avoids having a >>>different mask for each diagonal length... >> >>The reason I pre-calculate even the simple shift and mask amounts, is that I >>believe an operation such as ((bitboard & mask[sq]) >> shift[sq]) is probably >>faster than, for example, ((bitboard >> (sq & 56)) & 0x00ff). I may be wrong, >>and I should probably set up a simple test program to see what the actual >>differnce is. > >It should be much slower. mask[sq] has *two* memory referencess. sq&56 has >one. If you do that a lot, it might not matter as you should get a L1 cache >hit (or at least a L2 hit). However, if you look at the above, on a non-alpha >it gets even worse, because bitboard & mask is a 64 bit and operation. the >bitboard>>is a 64 bit shift only... so on a 32 bit architecture the sq&56 >ought to be signficantly faster... > >on the PC's you count memory references *not* instructions. Because the typical >memory delay is 20+ clock cycles... OK. I ran a couple tests using a Pentium 133 (close to a worst-case senario for 64-bit instructions, I guess :) The first test used this form of the routine: ((bitboard & mask[sq]) >> shift[sq]), where mask[] was a 64-bit integer, and shift[] was an 8-bit integer (actually a char type), and tested it's speed in comparison to this form: ((bitboard >> (sq & 56)) & 0xff). The result was that the second form was, on average, about 2% faster, as per your prediction. Then I ran another test against the second form, this time using ((bitboard >> shift[sq]) & mask[sq]) where mask[] was an 8-bit integer type. This was actually faster by about 0.5%. Obviously the change from a 64-bit mask to an 8-bit mask was what made the difference. I don't know enough about raw machine instructions etc., but I suppose that constants still have to be loaded into a register at some point during the operation. In the tests, I used "Release" mode optimizations for the compiler, just to be thorough.
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.