Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: mask of highest bit - additional notes

Author: Gerd Isenberg
Date: 11:20:43 09/22/05
On September 22, 2005 at 12:35:50, Andrew Shapira wrote:

>"There is no way to do that" is a pretty strong statement.  Where's the proof?
>

hmm, i said no cheap way with the operators you mentioned.
I have to admit that my mathematical skills are a bit limited, so i have no
proof but intuition ;-)

It is obvious that a substaction by one borrows from least significant one bit
(LS1B), so LS1B becomes zero and lower zero bits (if any) become one. That's the
reason why it is easy and cheap to extract or reset LS1B, or to get zero-one
chains separated by LS1B.

There is no such arithmetical and/or bitwise boolean or shift instruction that
does the same for the most significant one bit (MS1B), except a special hardware
implementation of reverse add/sub.

---------------------------------------------------------------------------

Imagine we have CPUs without arithmetical (add/sub) instructions - and we have
to implement add/sub in software with bitwise/shift instructions.

Add and carry propagation is done by "xor" and "and". The carries are shifted in
the desired direction and then iterative added again:

uint add (uint x, uint y) {
  p  = x ^ y;   // sum by xor
  g  = x & y;   // carry by and
  while ( g ) { // more carries?
    g <<= 1;    // yes, shift carries left
    x = p ^ g;  // sum by xor
    g = p & g;  // carry by and
    p = x;
  }
  return p;
}

to compute the all carries in parallel prefix manner:

g = x & y; // overflows
p = x ^ y; // sums
c[i+i] = g[i] | p[i]&c[i]

c1 = g0
c2 = g1 | c1&p1 = g1 | g0&p1
c3 = g2 | c2&p2 = g2 | g1&p2 | g0&p2&p1
c4 = g3 | c3&p3 = g3 | g2&p3 | g1&p3&p2 | g0&p3&p2&p1
...

Eg. a resulting Kogge-Stone routine for eight bytewise SIMD add and sub:

BitBoard paddb (BitBoard x, BitBoard y) {
  BitBoard g, p;
  g  = ( x & y);
  p  = ( x ^ y) & 0xfefefefefefefefe;
  g |= (g << 1) & p;
  p  = (p << 1) & p;
  g |= (g << 2) & p;
  p  = (p << 2) & p;
  g |= (g << 4) & p;  // the final carries
  return ((g << 1) & 0xfefefefefefefefe) ^ (x ^ y);
}

BitBoard psubb(BitBoard x, BitBoard y) {
  BitBoard  g, p;
  g  = (~x & y);
  p  = (~x ^ y) & 0xfefefefefefefefe;
  g |= (g << 1) & p;
  p  = (p << 1) & p;
  g |= (g << 2) & p;
  p  = (p << 2) & p;
  g |= (g << 4) & p;
  return ((g << 1) & 0xfefefefefefefefe) ^ (x ^ y);
}

With this kind of software implemented carry propagation, it is also possible
to propagate carries in other directions than from LSB to MSB, eg. from MSB to
LSB by using >> and other wrapmasks (0x7f...) to avoid byte overflows.

---------------------------------------------------------------------------

Since subtracting simd-wise is the fastest approach (with MMX or SSE2) to get
attack sets in one horicontal direction from a set of rooks or queens:

occu  ::= occupied set
rooks ::= rooks/queens subset of occu
rightAttacks ::= occu ^ (occu - 2*rooks)
or
rightAttacks ::= occu ^ ((occu & ~rooks) - rooks)

SEE2 implementation with vectors of two bitboards:

movdqa	 xmm0, xmm2  ;   rooks
pandn	 xmm0, xmm3  ;  ~rooks & occu
psubb	 xmm0, xmm2  ; (~rooks & occu) - rooks
pxor	 xmm0, xmm3  ;((~rooks & occu) - rooks) ^ occu

You may now understand my interest to get carry propagation in reverse or other
directions as well, to have a fast replacement for Steffan Westcott's parallel
prefix Kogge Stone routines for sliding attack generation:

http://chessprogramming.org/cccsearch/ccc.php?art_id=261956

"The other method is inspired by parallel prefix circuits, a hardware design for
fast carry propagation in adders. The Kogge-Stone method lends itself to
convenient implementation in software, here used to propagate sliding chess
pieces rather than carry bits."

-----------------------------------------------------------------------

So i had some sleepless nights about that stuff as well ;-)

Gerd
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.