Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Tables or Calculation

Author: David Rasmussen

Date: 04:24:38 01/19/03

Go up one level in this thread


On January 18, 2003 at 19:12:55, Matt Taylor wrote:

>On January 18, 2003 at 15:57:08, David Rasmussen wrote:
>
>>
>>Mmm. That's weird. My bitboard-based program was as fast if not a little bit
>>faster with MSVC7 then with MSVC6. I can't believe it's library calls.
>
>VC 6 and VC 7 both implement complex 64-bit stuff with library calls. More
>primitive ops (add, subtract, compare) are handled inline as they ought to be.
>In the case of constant << count, the compiler -should- be able to generate good
>64-bit emulation code inline.
>
>Back when I was first implementing the LSB scan functions, I compiled the 64-bit
>binary search algorithm using VC 7. There were several ways to implement a
>"round" of the search. VC 7 chose the worst way.
>
>I'm not terribly suprised, though. 64-bit computation isn't common, and the
>compiler likely has not been tuned to produce super 64-bit code.
>

I'm surprised, because from the beginning, MSVC (6 or 7) always produced the
fastest output for me. Way faster than Borland and GCC and Sun. Intel is about
as fast. With profiler guided optimization, Intel wins. So either all other
compilers are doing something even worse, or you are wrong.

>>
>>I don't know, but I guess I should check it's generated code.
>>
>
>Probably should. I'm betting they use a library call. If not, they may use a
>combination of shld (slow) and logical ops to accomplish the shift. The assembly
>I posted should be ideal for you as it avoids the table (even though the table
>is small) and runs just as fast. Downside is it doesn't run on old, old machines
>(like original Pentium/original K6).
>

For the table version (MSVC7):

; 56   : INLINE BitBoard Mask(Square square) { return mask[square]; }

	mov	ecx, DWORD PTR _square$[esp-4]
	mov	eax, DWORD PTR ?mask@@3PA_KA[ecx*8]
	mov	edx, DWORD PTR ?mask@@3PA_KA[ecx*8+4]
	ret	0

For the shift version:

; 55   : INLINE BitBoard Mask(Square square) { return BitBoard(1) << square; }

	mov	ecx, DWORD PTR _square$[esp-4]
	mov	eax, 1
	xor	edx, edx
	jmp	__allshl

So it seems you are right :)

With Intel C++, I can't seem to find the definition of the inlined Mask() in the
bitboard.h file, where it was when compiler with MSVC. I don't know where Intel
puts inlined definitions in assembly output. So I made a dummy file and
function, and tried to trick Intel C++. I didn't succeed:

;;; 	Square s = E4;
;;; 	++s;
;;; 	BitBoard b = Mask(s);
;;; 	return PopCount(b);

        mov       DWORD PTR [ebp-8], 536870912                  ;8.18
        mov       DWORD PTR [ebp-4], 0                          ;8.18

It understands that b is constant, and it has calculated it at compile time.
Making another dummy function ( takes square, return Mask(square); ) reveals
this:

;;; 	return Mask(s);

        mov       eax, 1                                        ;5.14
        xor       edx, edx                                      ;5.14
        mov       ecx, DWORD PTR [esp+4]                        ;5.14
        call      __allshl                                      ;5.14
                                ; LOE eax ebx ebp esi edi
.B1.5:                          ; Preds .B1.1
        ret                                                     ;5.14
        ALIGN     4

So Intel uses calls too!

I wonder how much could be gained here!

I will post an assembly programmers challenge in another thread :)

/David




This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.