Computer Chess Club Archives



Subject: Re: A Few Comments

Author: Gerd Isenberg

Date: 11:45:45 01/20/03



On January 19, 2003 at 14:01:56, Matt Taylor wrote:

>Interesting to note that several of those routines rely on technically undefined
>behavior. Under the bsf instruction, the manual states that, "...If the contents
>of the source operand are 0, the contents of the destination operand are
>undefined." Conveniently it seems that this works on all existing
>implementations.
>
>A similar trick can be used with shifts. Integer shift instructions mask their
>shift count to avoid unnecessary work. As a result, shifting by 32 does not
>change the destination operand.
>
>I probably won't optimize your code for Pentium 4. I was rather annoyed when
>some code I wrote executed about as fast on my Pentium 90 as it would on a
>high-end Pentium 4. All the old tricks are now expensive. Shifting is 4 clocks
>latency. The full adder (adc/sbb) is 2-3 clocks -throughput-. Latency is 6-8
>clocks. The setcc instruction is 5 clocks latency. Every one of these
>instructions has a latency of 1 on Athlon and the original Pentium. They all
>execute with a throughput of up to 3 instructions per cycle (1/3) on Athlon and
>2 instructions per cycle (1/2) on original Pentium. Sigh.
>
>I'll optimize it for Athlon since I am now most familiar with its rules, and I
>have tools to analyze the code. Taking a look now...
>
>-Matt

Hi Matt,

One question about the slightly modified BitBoard(1)<<sq code you posted
recently:

BitBoard getSquareBB(int sq)
{
	_asm
	{
		mov    ecx, [sq] ; I want to skip this load
		mov    edx, 1
		xor    eax, eax
		shl    edx, cl
		test   cl, 32
		mov    ecx, eax
		cmovz  eax, edx
		cmovz  edx, ecx
	}
}
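
For readers who want to follow the routine step by step, here is a rough
portable C model of what the asm computes. It is only a sketch: the name
getSquareBB_model is made up for illustration, and it assumes BitBoard is an
unsigned 64-bit integer, that sq is in the range 0..63, and that the 64-bit
result is returned in edx:eax.

```c
#include <stdint.h>

typedef uint64_t BitBoard;   /* assumption: BitBoard is an unsigned 64-bit integer */

/* Hypothetical C model of the asm routine; not the original code. */
BitBoard getSquareBB_model(int sq)
{
    uint32_t cl   = (uint32_t)sq & 0xFFu;   /* only the low byte of ecx (cl) is used     */
    uint32_t edx  = 1u << (cl & 31u);       /* shl edx, cl: x86 masks the count to 5 bits */
    uint32_t eax  = 0;                      /* xor eax, eax                               */
    uint32_t zero = eax;                    /* mov ecx, eax: keep a zero for the high half */

    if ((cl & 32u) == 0) {                  /* test cl, 32: ZF is set when sq is 0..31    */
        eax = edx;                          /* cmovz eax, edx: the bit belongs in the low half */
        edx = zero;                         /* cmovz edx, ecx: clear the high half        */
    }
    /* For sq in 32..63 the bit stays in edx and eax stays 0. */
    return ((BitBoard)edx << 32) | eax;     /* combine the edx:eax pair                   */
}
```

For sq in 0..63 this should agree with (BitBoard)1 << sq; for example, sq = 32
leaves the bit in the high half, giving 0x0000000100000000.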

This works fine so far with MSC6.0. But if I try to use __fastcall to force
parameter passing via register (the first one is ecx by convention, which would
suit this code well), the following function breaks in release mode.

__forceinline
BitBoard __fastcall getSquareBB(int sq)
{
	_asm
	{
//		mov    ecx, [sq] ; I want to skip this load
// oops, but ecx generally does not hold the right value here
		mov    edx, 1
		xor    eax, eax
		shl    edx, cl
		test   cl, 32
		mov    ecx, eax
		cmovz  eax, edx
		cmovz  edx, ecx
	}
}

So far I have found no way to make the compiler pass a parameter in the ecx
register to an inlined asm routine. The same goes for the asm bsf routines and
others: I always get unnecessary store/load pairs at the beginning of those
functions.

0040109F 89 74 24 10          mov         dword ptr [esp+10h],esi
004010A3 8B 4C 24 10          mov         ecx,dword ptr [esp+10h]
// instead of                 mov         ecx, esi
004010A7 BA 01 00 00 00       mov         edx,1
004010AC 33 C0                xor         eax,eax
004010AE D3 E2                shl         edx,cl
004010B0 F6 C1 20             test        cl,20h
004010B3 8B C8                mov         ecx,eax
004010B5 0F 44 C2             cmove       eax,edx
004010B8 0F 44 D1             cmove       edx,ecx

Any hints about this?

Thanks in advance,
Gerd





Last modified: Thu, 15 Apr 21 08:11:13 -0700
