Computer Chess Club Archives


Subject: Re: a question about speed difference that I do not understand

Author: Dieter Buerssner

Date: 12:21:37 12/05/01


On December 05, 2001 at 14:45:54, Ed Schröder wrote:

>On December 05, 2001 at 14:11:34, Dieter Buerssner wrote:
>
>>On December 05, 2001 at 13:55:19, Ed Schröder wrote:
>>
>>>Well, I never leave it up to the compiler to decide things for me but that's
>>>easy because my background is assembler and my way of programming in C++
>>>still is as close to assembler as possible. In your case I would program it
>>>as follows:
>>>
>>>static char direction [65*65];       // create an area big enough, using char
>>
>>For zero-based squares, this data structure is too big; 64*64 is enough.
>>Also, because 64^2 is a power of two, it may give better alignment for
>>following data.
>
>See my other posting about why I take somewhat more space than needed. It is
>harmless for the speed of a program, alignment has nothing to do with
>it, I just waste some memory space to avoid bugs.
>
>
>
>>>To access the table:
>>>
>>>int x,y; char result;
>>>
>>>result = direction [x<<6][y];
>>
>>This is a syntax error. You probably mean direction[(x<<6)+y].
>
>You are right, a bug :)
>
>
>>>That is all. Any good compiler will produce fast code, something like:
>>>
>>>mov    EAX,x
>>>mov    EBX,y
>>>shl    EAX,6
>>>add    EAX,EBX
>>>mov    CL,direction[EAX]
>>>mov    result,CL

[snipped]

>I can not share your opinion although your point about the lea instruction is
>true and my above listed assembler code surely can be improved.
>
>The net gain to move C to ASM is still 30% if not more. I have checked MSVC 6
>and the ASM code it generates is to cry about.

I won't argue with this. I have functions (not for chess, for high-precision
arithmetic) that are a factor of 10 faster in assembler (mainly because there
is no access to the carry bit from a high-level language).
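
As an illustration of the carry problem, here is a minimal sketch in portable C
of adding two multi-word numbers. The name add_limbs and the 32-bit word size
are assumptions made for this example, not code from the functions mentioned
above. Because C has no carry flag, the carry has to be recovered with extra
comparisons, which is work an assembler loop built on add-with-carry avoids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds two n-word numbers, least significant word
   first, and stores the result in sum. Returns the carry out of the top
   word (0 or 1). */
uint32_t add_limbs(uint32_t *sum, const uint32_t *a, const uint32_t *b,
                   size_t n)
{
    uint32_t carry = 0;
    size_t k;

    assert(sum != NULL && a != NULL && b != NULL);

    for (k = 0; k < n; k++) {
        uint32_t s = a[k] + carry;   /* wraps modulo 2^32 */
        uint32_t c1 = (s < carry);   /* 1 only if that addition wrapped */
        uint32_t t = s + b[k];       /* wraps modulo 2^32 */
        uint32_t c2 = (t < s);       /* 1 only if that addition wrapped */

        sum[k] = t;
        carry = c1 | c2;             /* both cannot be 1 at the same time */
    }
    return carry;
}
```

Each word costs two additions and two comparisons here, and whether a given
compiler turns that back into a single add-with-carry is not something the
source code can guarantee.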

However, I stand by my suggestion about the low-level tricks. I just tried
exactly this example, once with the two-dimensional array and once with the
one-dimensional array and the shift, as you suggested. I expected no difference.

static unsigned direction[64][64];

unsigned foo(int i, int j)
{
  return direction[i][j];
}

static unsigned dir2[64*64];

unsigned bar(int i, int j)
{
  return dir2[(i<<6)+j];
}

Assembler output:

_foo    PROC NEAR
; File foo.c
; Line 4
        push    ebp
        mov     ebp, esp
; Line 5
        mov     eax, DWORD PTR _i$[ebp]
        shl     eax, 8
        mov     ecx, DWORD PTR _j$[ebp]
        mov     eax, DWORD PTR _direction[eax+ecx*4]
; Line 6
        pop     ebp
        ret     0
_foo    ENDP
_TEXT   ENDS
PUBLIC  _bar
_BSS    SEGMENT
_dir2   DD      01000H DUP (?)
_BSS    ENDS
_TEXT   SEGMENT
_i$ = 8
_j$ = 12
_bar    PROC NEAR
; Line 11
        push    ebp
        mov     ebp, esp
; Line 12
        mov     eax, DWORD PTR _i$[ebp]
        shl     eax, 6
        add     eax, DWORD PTR _j$[ebp]
        mov     eax, DWORD PTR _dir2[eax*4]
; Line 13
        pop     ebp
        ret     0
_bar    ENDP

So, do you still think the shift trick is worthwhile? When i and j are
already in registers, the more straightforward approach may even be faster.

; res = direction[i][j]
        mov     edx, eax ; i is in eax
        shl     edx, 8
        mov     edx, DWORD PTR _direction[edx+ecx*4] ; assume j is in ecx;
                                                     ; result goes to edx;


At least, I can see no way that it would be slower here.
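
For anyone who wants to check that the two forms really address the same
element, a minimal sketch follows. The helper names offset_2d, offset_shift
and count_mismatches are made up for this example; only the 64x64 table of
unsigned comes from the code above.

```c
#include <assert.h>
#include <stddef.h>

static unsigned direction[64][64];

/* Element offset the compiler computes for direction[i][j], measured from
   the start of the table. */
size_t offset_2d(int i, int j)
{
    const char *base = (const char *)direction;
    const char *elem;

    assert(i >= 0 && i < 64 && j >= 0 && j < 64);
    elem = (const char *)&direction[i][j];
    return (size_t)(elem - base) / sizeof direction[0][0];
}

/* Element offset computed by hand with the shift. */
size_t offset_shift(int i, int j)
{
    assert(i >= 0 && i < 64 && j >= 0 && j < 64);
    return ((size_t)i << 6) + (size_t)j;
}

/* Number of (i, j) pairs for which the two offsets differ. */
int count_mismatches(void)
{
    int i, j, bad = 0;

    for (i = 0; i < 64; i++)
        for (j = 0; j < 64; j++)
            if (offset_2d(i, j) != offset_shift(i, j))
                bad++;
    return bad;
}
```

Note the parentheses in offset_shift: without them, i<<6+j parses as
i<<(6+j), because + binds tighter than << in C.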

BTW, I get similar results when changing the type to unsigned char.

Regards,
Dieter




Last modified: Thu, 15 Apr 21 08:11:13 -0700
