Computer Chess Club Archives


Search

Terms

Messages

Subject: Compact encoding of chess positions

Author: Reinhard Scharnagl

Date: 01:36:19 03/04/05


Hi all,

though I would expect that Dieter Bürßner will immediately giving me counter
arguments (he always is arguing very seriously), I want to explain my encoding
approach more detailed here.

Regards, Reinhard.


Compact Encoding of Chess Positions (estimated Average 150.5 Bits)
==================================================================

We handle here valid chess positions of the traditional chess game
and calculate codelengths for a special subset of RELEVANT positions
matching ONE simple restriction (without relevance for practice):

  "For each performed promotion there also must have been captures
  during the game: one pawn or half an officer."

You hardly would find a notation of a played game including positions,
which could not be encoded by this method using at most 164 Bits.

The encoding consists of two parts A and B.


Part A (common data)
--------------------

Here positions of kings, castling rights and e.p. pawns will be encoded:

1) we have three different variants of position encoding:

   a) no king has castling rights (standard)
   b) one king has castling rights
   a) both kings have castling rights

 a) no king has castling rights (standard):

   Bit 0 -  5: white king coordinate
   Bit 6 - 11: black king coordinate

   Bit 12    : active color (w = 0, b = 1)
   Bit 13    : flag if an e.p. pawn exists or not

   code length = 14 Bits

 b) one king has castling rights:

   To mark this situation there will be used two neighboured positions,
   by replacing the coordinate of the castling enabled king by a neighbour

   a) left or right  <=> left castling right  (will include rook's position)
   b) above or below <=> right castling right (will include rook's position)
   c) diagonal       <=> both castling rights (will include rooks' positions)

   Bit 0 - 11: both such coordinates

   Bit 12    : side having castling rights (w = 0, b = 1)

   Bit 13    : active color
   Bit 14    : flag if e.p. pawn exists

   Because the castling enabled rooks has not to be encoded twice later,
   in part B there will be savings of at least three Bits.

   => effective length < standard code length

 c) both kings have castling rights:

   To mark this situation instead of the original kings' positions
   two identic pseudo-coordinates will be used

   Bit 0     : white's left castling right  (will include rook's position)
   Bit 1     : white's right castling right (will include rook's position)
   Bit 2     : black's left castling right  (will include rook's position)
   Bit 3     : black's right castling right (will include rook's position)

   Bit 4     : active color
   Bit 5     : flag if e.p. pawn exists

   Bit 6 - 11: identic to Bit 0 - 5

   Because the castling enabled rooks has not to be encoded twice later,
   in part B there will be savings of at least three Bits.

   => effective length < standard code length

2) e.p. file number (only if need be)

   if an e.p. pawn exists (priorily marked), here will follow:

   Bit 0 - 2 : file number of the affected column

   Because the e.p. pawn has not to be encoded twice later,
   in part B there will be savings of at least three Bits.

   => effective length is unchanged


Teil B (left pieces' data)
--------------------------

All pieces not yet encoded in part A will be encoded here:
(thus no kings, castling enabled rooks or e.p. pawns to encode)

1) Filling matrix: (max 62 Bits)

   For each undefined square will be encoded, whether it is filled or not.

   The enumerating sequence is starting with A1, H8, A2, H7, ... until
   the center of the board will be reached.

2) Color Bits: (max 30 Bits)

   going from square to square marked by the filling matrix one bit is
   added specifying the local used color.

3) Piece codes: (max 58 Bits) (max 30 pieces)

   for each square marked by the filling matrix:

   a) at the base lines (1 and 8) encode:
      Queen =  11, Rook =  10, Bishop =  01, Knight =  00
   b) at the inner lines (2 to 7) encode:
      pawn selction:
        Pawn = 0, else = 1 followed by
      officer selection:
        Queen =  11, Rook =  10, Bishop =  01, Knight =  00


Extra savings
-------------

If 16 piece placements of one color are known, color bits will stop.

If 8 pawns of one color are known, the pawn selection is ommitted for
that color (also at inner lines).

No savings when encoding a 16th piece of one color, because the set
of used officers has not been reduced to the initial piece set to
make the coding scheme more flexible in general.


Coding Length
-------------

1) Maximum length 164 Bits:

If the initial mentioned restriction will hold, there will be a maximum
code length of 14+62+30+58, when all pieces only would be placed at inner
line squares (line 2 to line 7).

2) Looking at promotions:

A promotion will at most increase the bit length by two Bits. In the obove
calculated unshortened encoding, a pawn will need just two Bits, an officer
consumes four Bits. Thus obeying the initial mentioned restriction will
secure that the maximum length of 164 bits will not be exceeded.

3) Average length for 32 pieces 150.5 Bits:

Average code length at officers (16*2 + 48*3)/64 = 2.75;
Average saving of color Bits                     =    2;
Average saving of piece code Bits:               =    6;

This will lead to an average length calculation 14+62+(30-4)+(54.5-6).


Verification
------------

If that encoding would hold that assumptions in encoding positions
of real played games (e.g. from PGN files) that could lead to claim
an upper bound for existing relevant chess positions (without counting
off illegal and technically impossible situations):

Average length:              150.5 Bits
Filling Matrix Redundances:  - 1.1 Bits

=> 2^149.4 = 9.416*(10^44)



This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.