Computer Chess Club Archives

Subject: Re: Modulo versus BitScan and MMX-PopCount

Author: Jeremiah Penery

Date: 12:09:42 11/30/02

On November 30, 2002 at 14:10:35, Leen Ammeraal wrote:

>On November 29, 2002 at 19:19:02, Gerd Isenberg wrote:
>
>>Times in seconds for 10^9 runs in a loop K7-2.1G:
>>mmx-parallel popcount     19.7
>>two asm bsf pairs v1      24.3
>>mod5811                   26.4
>>two asm bsf pairs v2      26.6
>>64bit % 5811              72.5
>>
>>
>>int BitSearch_v1 (BitBoard bb)
>>{
>>	__asm
>>	{
>>		bsf	eax,[bb]
>>		jnz	found
>>		bsf	eax,[bb+4]
>>		xor	eax, 32
>>	found:
>>	}
>>}
>>
>>int BitSearch_v2(BitBoard bb)
>>{
>>	__asm
>>	{
>>		bsf	eax,[bb+4]
>>		xor	eax, 32
>>		bsf	eax,[bb]
>>	}
>>}
>
>I compared v1 with v2 in a test with
>the positions wac141, wac163 and wac229.
>The outcome was that v2 was faster by
>about two or three percent.
>So v2 was better in my test (on a
>Pentium III, 500 MHz), while v1 was
>better in yours.
>
>Leen

I think it's because BSF is a very slow instruction on the Athlon (12-cycle
latency or something similarly horrible), but very fast on the PIII (one
cycle?).  V2 of the algorithm always executes two BSFs, while V1 can get away
with only one whenever the low half of the bitboard is nonzero.
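
To make the difference concrete, here is a rough C model of the two quoted
routines. It is only a sketch: bsf_model, scan_count, bit_search_v1 and
bit_search_v2 are hypothetical names, not anything from Gerd's code, and the
model assumes that BSF leaves its destination unchanged when the source is
zero, which is what V2 relies on. It counts scans rather than cycles, so it
says nothing about real timings; the bitboard must be nonzero.

```c
#include <stdint.h>

/* Hypothetical counter: how many BSF-like scans have been issued. */
static unsigned scan_count;

/* Hypothetical software model of a 32-bit BSF.
 * Returns 1 and stores the index of the lowest set bit in *dst when
 * src is nonzero. Returns 0 and leaves *dst untouched when src is zero
 * (assumed behaviour; V2 depends on it). */
static int bsf_model(uint32_t src, int *dst)
{
    int i = 0;

    ++scan_count;
    if (src == 0)
        return 0;
    while ((src & 1u) == 0) {
        src >>= 1;
        ++i;
    }
    *dst = i;
    return 1;
}

/* V1 style: scan the low half, and touch the high half only if needed. */
static int bit_search_v1(uint64_t bb)
{
    int r = 0;

    if (bsf_model((uint32_t)bb, &r))
        return r;                         /* one scan was enough */
    bsf_model((uint32_t)(bb >> 32), &r);
    return r ^ 32;                        /* r is 0..31, so this adds 32 */
}

/* V2 style: no branch, but always two scans. */
static int bit_search_v2(uint64_t bb)
{
    int r = 0;

    bsf_model((uint32_t)(bb >> 32), &r);  /* high half first */
    r ^= 32;
    bsf_model((uint32_t)bb, &r);          /* overrides r if low half is nonzero */
    return r;
}
```

Both functions return the same index for any nonzero bitboard. After a call
with a bit set in the low 32 bits, scan_count has gone up by one for V1 and by
two for V2, so V1 wins where each scan is expensive and V2's lack of a branch
can win where scans are cheap.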
