Author: Roberto Waldteufel
Date: 16:45:28 10/31/98
Go up one level in this thread
On October 31, 1998 at 10:44:25, Robert Hyatt wrote: >On October 31, 1998 at 07:27:43, Bo Persson wrote: > >>On October 29, 1998 at 13:20:45, Roberto Waldteufel wrote: >>[snip] >>>Of course, once I started using assembler on the intel CPU, the bsf and bsr >>>instructions were a god-send and I stopped all this nonsense with modular >>>arithmetic....:-) >>> >>>Best wishes, >>>Roberto >> >>I thought these instructions were microcoded and extremely slow on anything >>newer than a 386! When testing on a Pentium, a table lookup was actually much >>faster for me. >> >> >>Bo Persson >>bop@malmo.mail.telia.com > > >ditto here. But I only tested this on a P5 (non-mmx) when I started doing this >stuff. the instructions were grossly slow. Someone even tried some hand-coded >stuff to help and it was still slower than what I do at present (table lookup). I don't know if these instructions are microcoded or not, but if you guys both say so then I guess they probably are. However, I just ran some tests with a rather surprising outcome. I have recently developed a rotated bitboard implementation for diagonal moves with a view to upgrading my program from ordinary bitboards. Having got it working, I was interested to see how fast it was, so I got it to generate attacks for 10 seconds (sequentially doing all 64 squares in turn repeatedly) and calculated the number of maps generated per second, averaged out over several runs. I used a PII 333MHz (with MMX), and I got almost exactly 400,000 maps per second. Then I tested my old assembler coded method (using lots of bsf and bsr instructions!) for generating the same thing. Since it was easy to do at the same time, I tested also rook and queen map generations using this assembler method. In each case I had to insert two extra instructions (pushad and popad) on account of the loop within which the code ran, so the real values would be very slightly faster than what I measured: Bishop attack maps: 580,000 maps per second Rook attack maps: 560,000 maps per second Queen attack maps: 510,000 maps per second So the assembler version with the bsf and bsr instructions outperformed the rotated bitboard generator by over 25%. I have not tested rooks and queens with rotated bitboards, but I suspect the rooks would be similar to the bishops, and the queens would take double the time using rotated bitboards, whereas my method shows only a slight degradation in speed going from bishops or rooks to the omnidirectional queen. I conclude that either a) the assembler version is very efficient, or b) I screwed up somehow in my rotated bitboard implementation. I am quite ready to believe that (b) was the culprit, but even so, I think the assembler routine is still pretty fast. Has anyone else tested this sort of thing recently? I don't really know how much of a difference the PII technology makes, but unless I can speed up the rotated bitboard method I will stick to the assembler routine. Of course, whichever method is faster, I could still ditch the incrementally updated attack maps and generate on the fly like Crafty does. If bsf and bsr really are microcoded, would I not get much lower counts? Or maybe I ought to be getting even higher than I am? I don't know exactly how fast is fast in this context. Best wishes, Roberto
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.