Author: Gerd Isenberg
Date: 10:47:04 06/21/02
Go up one level in this thread
On June 20, 2002 at 17:10:46, Tim Foden wrote: >I have tested this routine (find bit and remove it) in Green Light, but it >appears to be slower than the current routine I am using. I guess this must be >because the BSF and BTR instructions are slow on my Athlon XP 1.46Gz. It may >also be because, for instance, when generating moves for white, most of the >pieces will be in the low DWORD, and most of the moves will be made to the low >DWORD, so I think that in general my routine will correctly predict the branch, >and will take the short route through the code. > ...snip > >Cheers, Tim. > >BTW, here is the code from my current routine :) > >inline Square BitScanForwardRemove( UINT64& bitmap ) >{ > __asm > { > mov ebx, [bitmap] > mov edx, 1 > > bsf ecx, [ebx] > mov eax, 0 > jnz found > > bsf ecx, [ebx + 4] > mov eax, 32 > lea ebx, [ebx + 4] > jnz found > > mov eax, XX > jmp done > > found: > shl edx, cl > add eax, ecx > xor [ebx], edx > > done: > } > > // value returned in eax >} > > >Cheers, Tim. Thanks Tim for sharing your code. I tried it and it's definitely faster (at least on AMD1,4GHZ). I only have first impressions, and have to do some profiling. In one testpositions so far a have a solution time of 102 sec instead of 115 sec (may be influenced by some other chaotic optimizing or whatever effects). Even if the number of used registers is larger in your inline routine and therefore it may more difficult to hold vars of the calling routine in registers, it's faster - very interesting. It seems that this vector path instructions (bsf, btr) are very inefficient on Athlons. Cheers, Gerd
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.