Computer Chess Club Archives


Subject: Re: Bitboard stuff

Author: Gerd Isenberg

Date: 23:28:16 06/22/04


On June 23, 2004 at 00:02:06, Russell Reagan wrote:

>Assume we are using a 32-bit machine. Let's say that I have a simple macro that
>takes a bitboard and returns a given 8-bit rank of that bitboard. Something like
>this:
>
>#define GetRank(b,sq) ((b >> (sq & 56)) & 255)
>
>Let's say I write this as a function:
>
>unsigned GetRank (Bitboard b, int square)
>{
>    return (b >> (square & 56)) & 255;
>}
>
>In this situation (and any others like it), the return value will always be an
>8-bit value. Is there any reason to prefer that the return type be unsigned char
>(8-bit) or unsigned int (32-bit)?

The native register width (32-bit) is most often preferable.
On x86, partial 8-bit register handling can cause stalls between otherwise
independent byte registers that share one 16/32-bit register, e.g. al and ah.

Even though the cheap "and 0xff" in your function above seems unnecessary when
returning al, one needs a further movzx later to index some array, since you
can't address memory via partial byte registers.
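To make that concrete, here is a minimal C sketch of the usual pattern: the rank byte is returned as a plain unsigned and used directly as a table index. The names get_rank, rank_popcount, init_rank_popcount and pieces_on_rank are only illustrative assumptions, not anything from Russell's program, and the table contents (a bit count per rank byte) are just a stand-in for whatever a real engine would look up.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

/* Hypothetical lookup table: number of set bits for each 8-bit rank pattern. */
static unsigned char rank_popcount[256];

/* Fill the table once at startup. */
void init_rank_popcount(void)
{
    for (unsigned i = 0; i < 256; ++i) {
        unsigned n = 0;
        for (unsigned v = i; v != 0; v >>= 1)
            n += v & 1u;
        rank_popcount[i] = (unsigned char)n;
    }
}

/* Return the 8-bit rank containing 'square' as a native-width unsigned. */
unsigned get_rank(Bitboard b, int square)
{
    assert(square >= 0 && square < 64);
    return (unsigned)((b >> (square & 56)) & 255u);
}

/* The return value is used as an array index, so it has to end up in a
   full-width register anyway. */
unsigned pieces_on_rank(Bitboard b, int square)
{
    return rank_popcount[get_rank(b, square)];
}
```

Whether the compiler keeps the explicit mask or folds it into a movzx depends on the compiler and its settings, so it is worth checking the generated assembly.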

>
>The only potential issues I can think of are:
>
>1. Converting between types. I don't think this matters as long as the types are
>both unsigned. I think the only time this matters is when you are converting
>from (say) an 8-bit signed value to a (say) 32-bit signed value, since a sign
>extension is required.

For 8->16 or 8->32 bit, zero extension (movzx) is about the same effort as sign
extension (movsx). AMD64 implicitly zero extends to 64-bit after an operation on a
32-bit target register, so a signed int requires explicit sign extension to 64-bit.
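At the C level the difference between the two kinds of widening looks like the sketch below. The function names are made up for illustration, and which instruction a compiler actually emits for each conversion is an assumption that depends on the compiler and target.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned widening: the upper bits are filled with zeros. */
uint32_t widen_unsigned(uint8_t v)    { return v; }
uint64_t widen_unsigned64(uint32_t v) { return v; }

/* Signed widening: the sign bit is copied into the upper bits. */
int32_t widen_signed(int8_t v)     { return v; }
int64_t widen_signed64(int32_t v)  { return v; }

/* Small self-check of the resulting bit patterns. */
void check_widening(void)
{
    assert(widen_unsigned(0xF0u) == 0x000000F0u);
    assert((uint32_t)widen_signed((int8_t)-16) == 0xFFFFFFF0u);
    assert(widen_unsigned64(0xFFFFFFFFu) == 0x00000000FFFFFFFFull);
    assert((uint64_t)widen_signed64(-1) == 0xFFFFFFFFFFFFFFFFull);
}
```

The last two asserts are the 32->64 bit case: the unsigned value needs no extra work on AMD64, while the signed one has to have its sign bit replicated.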

>
>2. Cache issues. By using a smaller type (8-bit in this case), is there any
>chance that this will help the cache store more data per cache line? I know that
>if you have a big array of values that can be either 8-bit or 32-bit values,
>then choosing the 8-bit values would probably help, but in the case of the
>return value, I don't think this would make any difference on the effectiveness
>of the cache.

The type of the index has nothing to do with the type of the array elements.
You need a 32(64)-bit register anyway to address that array. On x86, depending on
sizeof(array[0]), the index register may be scaled directly by 2/4/8 in the
address calculation unit without any penalty:

// byte access
    mov    al, byte ptr [array + ebx]
    movzx eax, byte ptr [array + ebx]
// 16-bit word access
    mov    ax, word ptr [array + 2*ebx]
    movzx eax, word ptr [array + 2*ebx]
// 32-bit dword access
    mov   eax, dword ptr [array + 4*ebx]
// 64-bit qword access
    mov   eax, dword ptr [array +     8*ebx]
    mov   edx, dword ptr [array + 4 + 8*ebx]
// AMD64
    mov   rax, qword ptr [array + 8*rbx]
// mmx
    movq  mm0, qword ptr [array + 8*ebx]


For huge arrays it may be more cache friendly to pack the elements into bytes.
Anyway, the index register is always 32/64 bit.
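A short C sketch of the same point, with hypothetical names and an arbitrary table size: both tables are indexed with the same unsigned index, and only the element type decides how much memory (and therefore how many cache lines) the table occupies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TABLE_SIZE = 1024 };   /* arbitrary size, for illustration only */

static uint8_t  packed_table[TABLE_SIZE];   /* 1 byte per element  */
static uint32_t wide_table[TABLE_SIZE];     /* 4 bytes per element */

/* Same index type for both tables; only the element width differs. */
unsigned read_packed(unsigned index)
{
    assert(index < TABLE_SIZE);
    return packed_table[index];
}

unsigned read_wide(unsigned index)
{
    assert(index < TABLE_SIZE);
    return wide_table[index];
}

/* Memory footprint of each table in bytes. */
size_t packed_bytes(void) { return sizeof packed_table; }
size_t wide_bytes(void)   { return sizeof wide_table; }
```

With these sizes the byte table takes a quarter of the memory of the 32-bit one, which is where any cache benefit would come from.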

>
>3. Register issues. Since the return value will only require one byte of a
>register, can the compiler use the other three bytes for something?

There are cases where the compiler may use additional 8-bit register variables.
Due to partial register stalls it is often faster to use 32-bit anyway, even if
that requires some more memory traffic on the stack.

Gerd



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.