Computer Chess Club Archives

Subject: Re: Question for Gerd: reduce rbb lookup tables

Author: Gerd Isenberg

Date: 23:21:14 07/14/03

On July 14, 2003 at 19:38:40, Sune Fischer wrote:

>On July 14, 2003 at 18:59:16, Gerd Isenberg wrote:
>>
>>Read again, the ShiftL45[s] lookup is not necessary.
>>
>>__forceinline
>>BitBoard A1H8Attacks(unsigned int sq) const
>>{
>>  return sA1H8Atta[sq]
>>   [(*(((BYTE*)&(m_OccuBBA1H8))+((sq-Rank(sq))&7))&0x7e)>>1];
>>   // diaindex = (file-rank) & 7
>>}
>>
>>Gerd
>>
>
>Thanks Gerd, I'll think it over.
>
>My transformation is a little different I think, though you also have one shift
>and one AND, you've just replaced the table with some "magic", could be faster I
>guess.
>
>I once heard that working with byte-size variables is slower than working with
>native integer (32-bit) sizes, so I'm not fully convinced 8-bit casting is that
>much faster(?).
>
>-S.

The natural processor word length is almost always best. But for extracting an
aligned byte from a bitboard, I guess movzx does quite well. 64-bit shifting is
much more expensive on x86-32, where a bitboard occupies two 32-bit registers.

Gerd
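
To make the comparison concrete, here is a minimal sketch of the two ways to get the 6-bit inner occupancy from one byte of a bitboard: an aligned byte load (what the cast in the quoted code does) and a 64-bit shift. The names diagIndex, occByLoad and occByShift are made up for illustration, the square numbering sq = 8*rank + file is an assumption, and the byte-pointer variant assumes a little-endian host such as x86.

```cpp
#include <cassert>
#include <cstdint>

using BitBoard = std::uint64_t;

// Assumed square numbering: sq = 8*rank + file, a1 = 0, h8 = 63.
inline unsigned rankOf(unsigned sq) { return sq >> 3; }
inline unsigned fileOf(unsigned sq) { return sq & 7; }

// Byte index as in the quoted code: (sq - rank) & 7.
// sq - rank = 7*rank + file, and 7 is congruent to -1 modulo 8,
// so this is the same value as (file - rank) & 7.
inline unsigned diagIndex(unsigned sq) { return (sq - rankOf(sq)) & 7; }

// Variant 1: read one aligned byte through a byte pointer.
// Assumes little-endian layout, where byte i holds bits 8*i .. 8*i+7.
inline unsigned occByLoad(const BitBoard& bb, unsigned i) {
    assert(i < 8);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&bb);
    return (p[i] & 0x7eu) >> 1;   // drop both edge bits, keep the 6 inner bits
}

// Variant 2: the same 6 bits via a 64-bit shift. On a 32-bit target the
// shift has to be synthesized from several 32-bit instructions.
inline unsigned occByShift(BitBoard bb, unsigned i) {
    assert(i < 8);
    return static_cast<unsigned>(bb >> (8 * i + 1)) & 0x3fu;
}
```

Both variants return the same index on a little-endian machine; the difference is only in what the compiler has to emit. With the byte load, a 32-bit x86 compiler can typically use a single zero-extending load followed by a 32-bit AND and shift.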

AMD Athlon(TM) Processor
x86 Code Optimization Guide, page 113

Use the MOVZX and MOVSX instructions to zero-extend and sign-extend byte-size
and word-size operands to doubleword length. Typical code for zero extension
that replaces MOVZX, as shown in Example 1 (Avoid), uses more decode and
execution resources than MOVZX. It also has higher latency due to the
superset dependency between the XOR and the MOV which requires a merge
operation.

Example 1 (Avoid):
XOR   EAX, EAX
MOV   AL, [MEM]

Example 1 (Preferred):
MOVZX EAX, BYTE PTR [MEM]

---

MOVZX reg16/32, mreg8: opcode 0Fh B6h, ModR/M 11-xxx-xxx, DirectPath, 1 cycle latency

Last modified: Thu, 15 Apr 21 08:11:13 -0700