Author: Jeremiah Penery
Date: 12:09:42 11/30/02
On November 30, 2002 at 14:10:35, Leen Ammeraal wrote:

>On November 29, 2002 at 19:19:02, Gerd Isenberg wrote:
>
>>Times in seconds for 10^9 runs in a loop K7-2.1G:
>>mmx-parallel popcount     19.7
>>two asm bsf pairs v1      24.3
>>mod5811                   26.4
>>two asm bsf pairs v2      26.6
>>64bit % 5811              72.5
>>
>>
>>int BitSearch_v1 (BitBoard bb)
>>{
>>  __asm
>>  {
>>    bsf eax,[bb]
>>    jnz found
>>    bsf eax,[bb+4]
>>    xor eax, 32
>>  found:
>>  }
>>}
>>
>>int BitSearch_v2(BitBoard bb)
>>{
>>  __asm
>>  {
>>    bsf eax,[bb+4]
>>    xor eax, 32
>>    bsf eax,[bb]
>>  }
>>}
>
>I compared v1 with v2 in a test with
>the positions wac141, wac163 and wac229.
>The outcome was that v2 was faster by
>about two or three percent.
>So v2 was better in my test (on a
>Pentium III, 500 MHz), while v1 was
>better in yours.
>
>Leen

I think it's because BSF is a very slow instruction on the Athlon (12 cycle latency or something horrible), but very fast on the PIII (one cycle?). V2 of the algorithm always does two BSF's, while V1 has the potential to do only one.
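To make the difference in scan counts concrete, here is a rough portable C sketch of the two quoted routines. It is only an illustration, not the code anyone in the thread benchmarked: scan32, bit_search_v1_sketch and bit_search_v2_sketch are made-up names, and scan32 is a plain loop standing in for BSF. The sketch also assumes that a scan of a zero word leaves the previous result untouched, which is what v2 seems to depend on (AMD documents that behaviour for BSF, while Intel lists the destination as undefined in that case). Callers are assumed to pass a nonzero bitboard.

```c
#include <stdint.h>

/* Hypothetical stand-in for BSF: returns the index of the lowest set bit
   of w. If w is zero it returns prev unchanged, modelling the
   "destination left untouched" behaviour that v2 appears to rely on. */
static int scan32(uint32_t w, int prev)
{
    int i = 0;
    if (w == 0)
        return prev;
    while ((w & 1u) == 0) {
        w >>= 1;
        ++i;
    }
    return i;
}

/* v1 style: scan the low word first and branch; the high word is
   scanned only when the low word is empty. */
int bit_search_v1_sketch(uint64_t bb)
{
    uint32_t lo = (uint32_t)bb;
    uint32_t hi = (uint32_t)(bb >> 32);

    if (lo != 0)
        return scan32(lo, 0);      /* one scan, done */
    return scan32(hi, 0) ^ 32;     /* second scan only when needed */
}

/* v2 style: no branch, but always two scans. The high-word result is
   computed first and overwritten whenever the low word has a bit set. */
int bit_search_v2_sketch(uint64_t bb)
{
    uint32_t lo = (uint32_t)bb;
    uint32_t hi = (uint32_t)(bb >> 32);

    int r = scan32(hi, 0) ^ 32;    /* always executed */
    return scan32(lo, r);          /* always executed */
}
```

Both functions return the same index for any nonzero input. The trade-off is one well-predicted branch plus usually a single scan (v1) against no branch but two scans every time (v2), so which one wins would depend on how expensive the scan instruction is on the CPU in question.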
Last modified: Thu, 15 Apr 21 08:11:13 -0700