# Computer Chess Club Archives

## Messages

### Subject: Algorithm for Encoding Any Position in 155 bits (part 1/2)

Author: Ratko V Tomic

Date: 00:37:05 11/15/99

Go up one level in this thread

```  -------------------------------------------
CODING ANY LEGAL CHESS POSITION IN 155 BITS
-------------------------------------------

> 2) How do you decode such a monster?

Although I didn't originally plan to work out a practical and
detailed encoding/decoding algorithms (I only knew, from the
general programming experience, that the framework described
could be implemented efficiently in time & space), now that you
wish to be convinced beyond any doubt, I'll describe the
algorithm in enough detail for anyone to try it out (I have no
idea why would anyone wish to do so, but here it is anyway).

The full position code FPC was a fixed length (155 bit without
castle or EP; around 157 with these included) code, consisting
of 2 variable length components, the Material Content (MC, i.e.
material counts) part and the Material Placement part (MP), or
symbolically: FPC=(MC,MP(MC)), where we use MP(MC) to denote
that the placement encoding is dependent on the given MC state.
The MC part was a Huffman coded index into an array of all MC
states (there are 58,084,310 distinct MC states). The MP part,
for a given MC, was an index into an array of all distinct
permutations (placements) of the given (by this MC) material.
These (conceptual only, of course) arrays, one for each of the
58 million MC states, would have as many as 3*10^42 entries, or
in total 2.46*10^46 entries for all such arrays.  This
conceptual "algorithm" is obviously impractical due to the huge
memory demands.

For clarity, the description of the practical algorithm is
divided into 3 parts (phases) and  the encoding + decoding are
covered in parallel in each phase. Part 1 covers
encoding/decoding  of the MC part to and from a fixed length
(26 bit) binary coded MC index (MC26 index). Part 2 decribes
how, for a given MC and any one of its placements, to produce
an MP index and how, given an MC and any of its index values to
reconstruct the placement. Part 3 describes Huffman (de)coding
of the MC26 index, in conjuction with the MP part of the code,
resulting in the fixed length FPC-155 code.

The algorithm is "practical" in the sense that it will complete
on any PC and encode/decode hundreds of positions in a "blink
of an eye," although it certainly could be optimized further
(if one were to convert it into a C code).

Notation:

BINOMIAL COEFFICIENTS:

{n;k} = n*(n-1)*...*(n-k+1)/k! = n!/(k!*(n-k)!) = {n,n-k}
{n;n} = {n;0} = 1
if n<k --> {n;k} = 0

MULTINOMIAL COEFFICIENTS:

{n;k1,k2,...,km} = n!/(k1!*k2!...*km!*(n-k)!) =
= n*(n-1)*...*(n-k+1)/(k1!*k2!*...*km!)
where k=k1+k2+...+km

a % b  is a modulo b
a \ b is integer division (or a\b = floor(a/b) = (a-a%b)/b)

1. INDEXING MATERIAL CONTENT
----------------------------

The MC state is specified via a record of 10 counts for the 10
pieces MC2 = (QRBNP,qrbnp) which have changeable counts. The
construction in the original post would first build an array of
MC states for one side only, MC1 = (QRBNP), under the
constraint that the total excess in the counts of Q,R,B,N plus
the number of pawns cannot exceed 8 (since each excess piece
could only come from the promotion). The resulting array MA1[]
had 8694 distinct states (13.1 bits index). The two side MC
state array MA2[] was constructed by pairing elements of MA1[]
among themselves as MC2 = (MA1[w], MA1[b]), where w index would
run 0..8693 and b index from 0..w (the case b>w is identical
regarding legality & # of placements, so it wasn't explicitly
recomputed, just the results of the sums were doubled for all
w>b cases), and for each pair the promotions/capture costraint
was checked:

wx <= 2*bpd + bnd + wd   (1w)
bx <= 2*wpd + wnd + bd   (1b)

where wx (or bx) is excess in white (black) pieces, wc (bc) is
the total white (black) count defect, bpd (wnd) is black
(white) pawn defect, and bnd (wnd) is black (white) nonpawn
defect. The terms excess and defect were covered in detail in
another post in this thread (which also deduced eqs. 1w,b),
briefly the definitions are:

D-1. The EXCESS of a piece is the piece count excess above its
initial state (e.g. 5 W.Queens have WQ excess 4), and it is
0 if piece count is not above the initial count. Only nonpawns
can have excess.  The sum of excesses for all white (black)
pieces is wx (bx).

D-2. The DEFECT for a nonpawn is the defect in the piece count
below its initial count (e.g. 0 W.Rooks have W.Rook defect 2;
1 W.Rook has defect 1; 3 W.Rooks have defect 0). The sum of
white (black) nonpawn defects is wnd (bnd).

D-3. The DEFECT for white (black) pawns is the number of
missing pawns  below the number 8-wx (or 8-bx) implied by
the excess for white (black), e.g. wpd = 8-wx-wnp  (where wnp
is number of white pawns).

D-4. The total white (black) DEFECT is wd=wnd+wpd  (or bd=bnd+bpd).

Important point is that all these excesses and defects are
computed directly from the counts in a given MC state, without
speculating on number of actual captures and promotions (which
are not uniquely determined from the MC counts).

The number of all (w,b) pairs is 8694^2 = 75,585,636 but the
constraint 1 was satisfied by only NMC = 58,084,310 pairs i.e.
26 whole bits (25.7916 bits) is needed for indexing. If we were
to fill out entire MCA2[NMC] array, we could easily decode any
MC26 Index  i (i=0..NMC-1) by simple lookup in MCA2[i] to
retrieve the MC2 state for this i. The encoding (i.e. producing
an index i from a given MC) would involve searching for a given
MC in MCA2[NMC] (using log2(NMC) comparisons) and the index
where the matching MC was found is our MC26 index i.

Had we not eliminated the elements from the (w,b) pairs, which
left scattered holes in the square array (w,b), we could have
used for both encoding and decoding only the MCA1[8694] array,
constructing the needed (w,b) pairs on the fly. Namely, to
encode (MCW,MCB) state, we would do 2 binary searches in the
MCA1[], obtaining w and b one side indices, then compute:

i(w,b)=8694*w+b       (2)

The decoding of i would be done as:

b=i%8694, w=i\8694   (3)

We can still do this, but this index would require 26.1716 bits
or 27 whole bits (instead of 26 whole bits) and would require
checking of the pair for legality (under the pair constraint
(1w,b)) before computing the placement index.

If the holes left by the constraint (1) in the square (w,b)
array were more regular e.g. if for each w index in the pairing
loop, the accepted pairs had a contiguous range of the b index,
say 0..bmax, then we could still apply the method (2) for
encoding by replacing in (2) the term 8694*w with an array
lookup OFFS[w], where the OFFS[0..8693] would contain code
offsets for each w index.

In fact, we can sort our original MCA1[] array so that exactly
this type regularity does hold. Namely, if we rewrite constraint
(1w,b) as:

wx - wd <= 2*bpd + bnd   (4w)
bx - bd <= 2*wpd + wnd   (4b)

we have the white and the black parameters on different sides of the
inequality.  If we now sort the array MA1[] in the decreasing order of
the value 2*bpd+bnd (which is the right hand side of (4w)), and for a
given w, run the b index from 0 and up, as soon as the first failure of
(4w) occurs, say at some index B1, all the remaining values of index b
(from B1 to 8693) will violate (4w) automatically. Namely, the right
hand side of (4w) can only stay the same or get lower as b index
increases (due to the sorting in decreasing order), hence if for b=B1
(4w) is violated it will stay violated for b>B1. If no violation
occured for any b, we will set B1=8694, so that each w will have its
own B1.

The total number of pairs produced this way, i.e. under the (4w)
constraint but without the (4b) constraint, is 62,096,507 = 25.888009
bits, which is 0.096 bits more than 25.791645 bits for the 58,084,310
pairs optimal array (obtained when both (4a) and (4b) are applied). But
in both cases the count fits in a 26 whole bits binary index. Note also
that we still maintain the full constraint (4w) and (4b) at our encoder
main entry, i.e. we return an error when we get an MC state which
violates either inequality. We also maintain the full constraint when
enumerating all possible placements, hence all the figures for
placement counts for all fully valid MCs, and the resulting Huffman
code weights (hence the MC Huffman code lengths) remain the same for
all fully valid MC states. Since the MC26 code is only an intermediate
code (being turned into a Huffman code, which remained same as for the
full constraint), the only effect of this relaxation is that some
numbers (MC26 binary index values) in the intermediate calculations
gain 0.096 bits, which given the use of whole bits (which remain under
the 26 whole bits) here doesn't produce any practical difference (other
than slightly enlarging the Huffman code tables, as we will see in part
3).

We now continue with the encoder/decoder construction. The value of B1
varies as w index varies (the outer loop index). As we check all pairs
this way, we compute and store B1 values (one obtained for each w
index) in an array B1[0..8693]. Then we compute cummulative values of
B1's into an array CB1[0..8693], where CB1[0]=0, ..., CB1[w+1] =
CB1[w]+B1[w], i.e. the CB1[] array will have 0, B1, B1+B2, B1+B2+B3,
..., B1+B2+...+B8692 as its elements. The encoding of MC into MC26 code
then works as follows:

E1) We obtain input MC state as (MCw,MCb), we check it
against the (4w,b) constraints and return an error if it
violates either.

E2) Then we search in the one side MC array (the MA1[0..8693] array
containing all valid one side MC records, which are the piece
counts) the MCw and Mcb states, and say we find them at positions
w and b. (If either is not found, the one side state is illegal
and we return an error).

E3) We compute the MC26 code as MC26 = CB1[w]+b.

(The followup Huffman encoding of MC26 is in part 3)

The decoding of MC26 works as follows:

D1) We obtain MC26 (from Huffman decoder, given in part3)
and do a binary search in the CB1[0..8693] array to find
the nearest lower (or equal) value to MC26 (this will take
up to 14 comparisons). The index at which this was found is,
say, w (it will always be found since it doesn't have to
match excatly MC26).

D2) The b index is then: b = MC26-CB1[w]. Since CB1[w] is the
nearest lower/equal CB1[] element to MC26, and since the
difference between adjacent CB1[] elements is at most 8694,
the value of b will always be in range 0..8693.

D3) Using these w and b we retrieve MCw=MA1[w] and MCb=MA1[b] to
obtain the full MC state (MCw,MCb).

So we have a quick way to get from the MC state to the MC26 index and
back, using the 2 arrays of 8694 elements each, MA1[] (using at most 17
bits per entry) and CB1[] (using at most 26 bits per entry). These
arrays are precomputed and are part of the program's constants (either
loaded as data files or included into the code as preinitialized arrays
of constants), hence their construction time isn't a part of the
encoder/decoder execution time.

2. INDEXING MATERIAL PLACEMENTS
--------------------------------

For a given MC state there will be NP different piece placements. The
task here is to compute index IP (IP range: 0..NP-1) for a given
specific placement MP. Alternatively, given IP, decoder must
reconstruct (uniquely, of course) the specific placement MP. There is a
more generic subtask here, which is: given some piece type with count M
and with NS available squares for placing the M copies of the piece,
find the indexing for all the unique placements of the M copies. Once
we have solved this problem, we can use variable radix coding to encode
the combined index for all piece types present, hence produce the the
MP(MC) part of the full code. The variable radix scheme will use the
maximum arrangements for each piece type (in given order of types) as
the radix for a given position, while the index value for a specific
placement of that piece type will be a digit in that place. We'll
tackle the generic sub-problem first.

2.1 INDEXING ARRANGEMENTS OF A SINGLE PIECE TYPE
-------------------------------------------------

We have M copies of a single piece type and NS squares (NS>=M) to place
this piece. The number of distinct placements is:

NP = NS*(NS-1)*..*(NS-M+1)/M! = {NS;M}    (5)

We will order our M pieces by placing piece P1 on square S1, P2 on
S2,... PM on SM, where no 2 pieces can go on one square. We will use
labeling convention where the following is satisfied:

1 <= SM < ... < S2 < S1 <= NS

i.e. if we imagine NS squares in an array 1,2,...NS then P1 is placed
rightmost (the front piece), P2 to the left of it, and the PM last to
the left of the rest (the tail piece). The encoder's objective is,
given a position list for the M pieces, SP=(SM,..,S2,S1), to obtain a
number I = f(SP) such that 0<=I<NP and that any two different SP1 and
SP2 always give a different I1 and I2. Decoder has to reconstruct for a
given I the unique SP(I) list.

This problem has an obvious recursive nature. Namely, if we look at the
fixed position of the front piece P1, the remaining pieces replicate
the same problem, but on smaller scale, with M1=M-1 pieces and on
NS1=S1-1 squares. Conceptually, the solution to the indexing problem
could work as follows: if we have some array of index base values IB[]
for each valid position of the P1 (which is M,M+1,...,NS), then the
Index for SP(NS,M) would be IB[S1] + Index(SP(S1-1,M-1)). The base
values IB[S1] would be constructed by adding up numbers of used up
index values for P1 positions lower than S1. Before giving the general
formula for the index, let me illustrate the method on 3 rooks
placements on an empty board (NS=64, M=3). The placement I=0 is 123,
i.e. the squares for the rooks are: S1=3, S2=2, S3=1. Here are several
more in a table, arranged by the front rook:

S1  IB   NA   Arrangements
-------------------------------------
3   0    1   123
-------------------------------------
4   1    3   124 134
234
-------------------------------------
5   4    6   125 135 145
235 245
345
-------------------------------------
6  10   10   126 136 146 156
236 246 256
346 356
456
-------------------------------------
7  20   15   127 137 147 157 167
237 247 257 267
347 357 367
457 467
567
-------------------------------------
8  35   21         128 ..... 178
.............
678
-------------------------------------
k {k-1;3} {k-1;2}  12k ... 1(k-1)k
...............
(k-2)(k-1)k
-------------------------------------
64  {63;3} {63;2}

The NA coulumn gives number of arrangements for the given front rook
position S1. The IB column is the cummulative value for NA up to (but
not including) a given S1. Our index for, say the entry 257 (S1=7) is
20+1+2+3+1=27, i.e. we take IB which is 20, then going column by column
(using thus rook 2 as the front rook) within the S1=7 group we count
how many arrangements came before the 257. To decode 27 into 257
arrangement, we first search in the IB array for the nearest IB[x]<=27
and IB[x+1]>27. This returns IB[7]=20, thus our front rook is at S1=7.
The sub-table for S1=7 is the same kind of arrangement table, except
that it is for 2 rooks on 6 squares, and flipped horizontally. The NA,
IB and S2 values here are:

S2  2  3  4  5  6
NA  1  2  3  4  5
IB  0  1  3  6 10

for the S2 values 2 3 4 5 6. The 2-rook index, which was offset by 20
for the 3 rook index, is 27-20=7. So we look in the 2 rook table for
the nearest IB<7, which is 6, for S2=5. The remaining 1-rook code is
7-6=1, and we have for the 1 rook table:

S3 1 2 3 4
NA 1 1 1 1
IB 0 1 2 3

We look for the nearest IB<=1, and find it for S3=2, giving the full
placement S3,S2,S1 as 257. Now we'll look how to create these types of
tables in the general case, (NS,M).

For M pieces on NS squares, the NA coulumn gives for each S1 (the front
piece position, which goes from M to NS) the number of arrangements of
the remaining M-1 pieces on S1-1 squares, hence using eq.(5):

NA(S1) = {S1-1;M-1}     (6)

The IB column for S1 is a cummulative NA() value up to the preceding S1
value, i.e.  IB(S1) = IB(S1-1)+NA(S1-1), for S1=M+1...NS and IB(M)=0,
for 0 based index. Expanding the recursion into a sum:

IB(S1) = Sum(j=M..S1-1; NA(j)) (7)

for S1>M and IB(M)=0 (by definition).

Using the binomial coefficient formula for summation on the upper
binomial coefficient index:

{n;n}+{n+1;n}+{n+2;n}+...+(m;n} = {m+1;n+1}   (where m>=n)

we get, IB(S1) = Sum(j=M..S1-1; {j-1;M-1}) = {M-1;M-1} + {M;M-1} + ..
.. + {S1-2;M-1} = {S1-1;M}, i.e.

IB(S1) = {S1-1;M}         (7)

Our cummulative index base is simply a binomial coefficient. The full
index is obtained recursing down to M=1 and adding all (lower M) IB(Sk)
contributions, yielding:

I(S1,S2,..SM) = Sum(j=1..M; {Sj-1;M-j+1})   (8)

= {S1-1;M} + {S2-1;M-1} + ... + {SM-1;1}

(Note that (8) assumes that by definition the terms of type {n;k} with
n<k are 0, e.g. if S3=1 or S2=2 or S1=3 in our M=3 rook example, then
the three terms are {2;3}=0 {1;2}=0 and {0;1}=0, giving I(3,2,1)=0.)

Decoding of I(S1,..,SM) works by finding x for the largest {x;M} < I,
the obtained {x;M} gives us the leading term in (8), with x=S1-1 i.e.
S1=x+1. Then we reduce I by this {x;M} i.e. assign I1=I-{S1-1;M} and
repeat this for the next term in (8), obtain S2, etc. The uniqueness of
decoding follows from observing how we obtained (7) & (8), i.e. by
noting that any term in (8) is greater than the sum of all the terms to
the right of it (the closests the two ever come together is if the
S1-S2=1, S2-S3=1, S3-S4=1,.. in which case {S1-1;M} is greater from the
sum to the right of it in (8) by exactly 1}, which makes this a sort of
number base with S1, S2,.. being its digits (S1 the most significant,
SM the least). The sum (8) achieves maximum when all S values are at
maximum, i.e. S1=NS, S2=NS-1, ... Sk=NS-k+1. In that case the sum (8)
is exactly equal {NS;M}-1, meaning that index values cover the required
range 0..NP-1, where NP was given by (5).

Searching for {x;M}<I we use in decoding involves a short binary search
in the table of {x;M} with x<=64 (which should complete in at most 6
comparisons). A precomputed 64x64 table could serve encoding & decoding
needs (since M is always given, we could make array of 64 pointers
indexed by M, each entry giving us a pointer into an array of 64 {x;M}
values).  If one were doing other than chess enumerations with this
combination indexing method, with large x values, and a machine has a
fast floating point, one can use estimate x ~~ M/2+(I*M!)^1/M (the m-th
root finds roughly the middle of the range of M multipliers in the
numerator of the {x;M}) to get us very near the correct x.

We can illustrate (8) by encoding our earlier example 257 (S3=2, S2=5;
S1=7, M=3):

I(7,5,2)={7-1;3} + {5-1;2} + {2-1;1} = 20+6+1 = 27

Decoding od 27 starts with looking for {x;3} nearest to 27, not
exceeding 27. Value {6;3}=20 and {7;3}=35, hence x=6 or S1=x+1=7. Then
I1=27-20=7. We look for {x;2}<=7 which from {4;2}=6 {5;2}=10 gives x=4,
hence S2=x+1=5. Then I2=6-5=1 and we look for {x;1}=x<=1, hence x=1 and
S3=x+1=2, i.e. the placement for the index 27 is 257.

2.2 INDEXING MULTIPLE PIECE TYPES
---------------------------------

Now we have k piece types, P1, P2 to Pk, and for each we have a count
M1, M2,.., Mk and the squares available, NS1 squares for the  P1,
NS2=NS1-M1 for P2, NS3=NS2-M2 for P3, etc. For this particular
convention of placing the pieces, the number of arrangements for P1 is
NP1={NS1;M1}, for P2 NP2={NS2;M2},... NPk={NSk;Mk}. The total number of
arrangements for the k types is NP=NP1*NP2*...*NPk.

Some particular arrangement of these k piece types will produce k
independent index values I1, I2, ... Ik (using the indexing method from
part (2.1)), and these index values have property: 0 <= Ij < NPj  for
all j=1,2..,k. To represent the full index I for all k piece type, we
will use I1,..Ik as k digits in a mixed radix representation, where the
1st digit is in radix B1=NP1, second digit in radix B2=NP2,.. , k-th
digit in radix Bk=NPk. Hence we form the full index for all k piece
types as:

I = I1+B1*(I2+B2*(I3+...+B[k-2]*(I[k-1]+B[k-1]*Ik)..) (9)

Where the range of the MP index I is 0..NP-1.

This is the same type of representation as for, say, a 4 digit decimal
number d with digits (d4 d3 d2 d1) as:

d = d1+10*(d2+10*(d3+10*d4)) = d1 + d2*10 + d3*10^2 + d4*10^3

Decoding of index I works in exact analogy with conversion of binary to
decimal radix, i.e. from given I we obtain "digits" I1, I2, ... using
the steps:

I1=I%NP1, I=I\NP1,                       (10)
I2=I%NP2, I=I\NP2,
....
I[k-1]=I%NP[k-1],  I=I\NP[k-1],
Ik=I

The obtained values I1, I2,..Ik would be decoded as the placement index
I in section (2.1). Note also that each subsequent piece uses the
left-over square array from the previous piece, hence the S1,S2,..,SM
in the section (2.1) refer to the empty square holes, 1 denoting the
1st hole, 2 the second hole, etc.

In using the mixed radix method for a given MC state, the pawns, which
can occupy only the 48 squares instead of 64 should be encoded in the
1st 2 "digits" if both color pawns are present, or as 1st digit if only
one color pawns are present. If we have wp and bp white and black
pawns, and n3,n4,n5,..,nk of other pieces, and we denote p=wp+bp, then
in the scheme above we will have:

NS1=48,     M1=wp,
NS2=48-M1   M2=bp,
NS3=64-p    M3=n3,
NS4=NS3-n3  M4=n4,
...
NSk=NS[k-1]-n[k-1]  Mk=nk.

To minimize the size of the NP numbers (the radix values), it is best
for a given MC, after dealing with pawns (which always come 1st if
present) to sort the counts n3,n4,.. in increasing order, so that the
largest n's come with the least NS number. Thus n3=1 and n4=1 would be
set for kings, then the rest of pieces. The worst case (for a size of
an individual NP) occurs when we have only the 2 kings and 10 copies of
a single piece (R,B or N). In that case the NP for this piece type is
{62;10}=1.1*10^11 = 36.6 bits, i.e. we cannot generally use 32-bit
integer for the NPs, but the 64-bit integers will work safely. The full
index value in (9) is much larger, the worst case being a 30 piece
MC={QRBNP=3,3,2,4; qrbnp=3,3,2,4} with number of placements and index
0<= I< NP=3.0678*10^42 = 141.14 bits i.e. at least 18 byte integers are
needed for to handle the mixed radix arithmetic in (9) and its decoder
in (10).

-- continued in 2/2

```