Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: optimization on reduced int range - e.g. knight-distance

Author: Mridul Muralidharan
Date: 20:11:36 12/19/05

Interesting idea Gerd , I was actually thinking of doing something similar ...
but also mask the king and pawn attack squares into this knight walk (yep ,
would need the knight walk path , see below).
Ofcourse , most cases this might not be useful since you might be able to stop
the knight walk by a well placed pawn push/king move ...
So , the step after that was to find list of unblocked pawns and see how much
they can influence the knight walk path , check the potential kings influence
and see if there is an alternate path for block ... this would make it dirty and
complicated and so not yet thrashed the idea out.
Have not yet completed the idea to even begin coding , so seeing this helps me
know that there is something to fall back on :-)

Thanks,
Mridul

On December 19, 2005 at 16:53:40, Gerd Isenberg wrote:

>Hi,
>
>i am aware of getting a knight-distance of two squares is not that important -
>but entertainment. I am aware of 0x88-difference to index a 15*16 table to get
>distance, taxi-distance, absolute rank-file-difference or even the
>knight-distance which should be conditionally modified for the one-square
>diagonal edge cases, where the knight need four jumps from eg. a1 to b2 instead
>of two from b2 to c3.
>
>In my "original" routine the dense of information keeping two int[64]-arrays for
>knight-distance 0..6 and boolean information of edges was too low for this rare
>code. With the cost of a shift one both arrays are combined to one
>amd64-cacheline.
>
>The use of abs-intrinsic from math.h, performing an branchless abs like
>(x ^ (x>>31))-(x>>31) with cdq-instruction, seems better to optimize by the
>msvc6 compiler than an "handmade" abs.
>
>By using only six or even three (file, rank) significant bits of 32 - instead of
>making an explicite signmask for abs via mov,sar31 or cdq (works only with
>eax,edx) - the bytewise signmask is implicitly in the upper byte register after
>a 32-bit subtraction with eax,ebx,ecx,edx.  Also there is a kind of simd with
>one 32-bit xor.
>
>No way to force my compiler to produce such code with char etc. other than
>inline assembly. While it is usually a bad idea to work with partial
>byte-registers - specially targeting high-byte register before using the whole
>32-bits - it works surprisingly well here.
>
>The C-version is 81 byte and takes 16 cycles, while the inline asm is exactly
>one amd64 cacheline and takes 11 cycles, both including call/return. Since the
>int-variables a,b,c are all used to index a char[64]-array, isn't that
>information appropriate to guide the compiler about reduced value range
>optimizations?
>
>Cheers,
>Gerd
>
>
>#include <math.h>
>
>const unsigned char knedge[64] = {
>  // 2*knight-distance + edge
>  2*0+1, 2*3+0, 2*2+0, 2*3+0, 2*2+0, 2*3+0, 2*4+0, 2*5+1,
>  2*3+0, 2*2+0, 2*1+0, 2*2+0, 2*3+0, 2*4+0, 2*3+0, 2*4+0,
>  2*2+0, 2*1+0, 2*4+0, 2*3+0, 2*2+0, 2*3+0, 2*4+0, 2*5+0,
>  2*3+0, 2*2+0, 2*3+0, 2*2+0, 2*3+0, 2*4+0, 2*3+0, 2*4+0,
>  2*2+0, 2*3+0, 2*2+0, 2*3+0, 2*4+0, 2*3+0, 2*4+0, 2*5+0,
>  2*3+0, 2*4+0, 2*3+0, 2*4+0, 2*3+0, 2*4+0, 2*5+0, 2*4+0,
>  2*4+0, 2*3+0, 2*4+0, 2*3+0, 2*4+0, 2*5+0, 2*4+0, 2*5+0,
>  2*5+1, 2*4+0, 2*5+0, 2*4+0, 2*5+0, 2*4+0, 2*5+0, 2*6+1
>};
>
>__forceinline
>unsigned int absRankFileDiff(unsigned int a, unsigned int b) {
>  return abs((a|7)-(b|7)) ^ abs((a&7)-(b&7));
>}
>
>// precondition a,b < 64
>unsigned int knightDistance(unsigned  int a, unsigned int b) {
>  unsigned int c = absRankFileDiff(a,b);
>  return 2*((knedge[a]^knedge[b]) & (c==9)) + (knedge[c]>>1);
>}
>
>
>// precondition a,b < 64
>unsigned int __fastcall _knightDistance(unsigned int a, unsigned int b) {
>#ifdef _DEBUG
>  __asm {
>    mov ecx, [a]
>    mov edx, [b]
>  }
>#endif
>  __asm {
>    mov eax,ecx
>    mov ebx,edx
>    and ecx,7
>    and edx,7
>    sub edx,ecx ; (a&7)-(b&7)
>    mov cl,knedge[ebx]
>    or ebx,7
>    xor cl,knedge[eax]
>    or eax,7
>    sub eax,ebx ; ((a|7)-(b|7))
>    ; use ah,dh as {0,-1}-signmask for abs
>    add dl,dh   ; abs(x) := (x+signmask)^signmask
>    add al,ah   ; for both differences
>    xor eax,edx ; kind of SIMD
>    xor al,ah   ; c = abs((a|7)-(b|7)) ^ abs((a&7)-(b&7));
>    cmp al,9
>    mov bl,al   ; high bytes of ebx still zero
>    movzx eax,knedge[ebx]
>    sete bl     ; c==9
>    shr eax,1   ; distance >> 1
>    and ebx,ecx ; edge[a]^edge[b] & (c==9)
>    lea eax,[2*ebx+eax]
>  }
>}
Re: optimization on reduced int range - e.g. knight-distance Gerd Isenberg 02:09:09 12/20/05
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.