Author: Larry Serflaten
Date: 07:13:17 11/15/99
At the invitation of "Les", I will attempt to describe the Arithmetic coding technique as it pertains to storing the chess piece positions on a chessboard. Further down is source code in Visual Basic for those who may want to study the implementation (so far). Encoding: This method uses numbers in the range of 0 to 1, so that, any output from the encoding process will always be a decimal (fractional) value. Where that value falls in the range of 0 to 1, has an infinate number of possibilities, because we can always add more numbers to the end of the decimal value, (for more encoded symbols) and still reside between 0 and 1. Between 0 and 1, the entire list of symbols is assigned a range of values. These symbols represent the different chess pieces, SPACE, Castle enabled Rooks, and En passant Pawns, etc. Reading from the board sequencially (A8 - H1) the SPACE or other symbol is sent to ecoding depending on what piece occupies the curent square. The encoded message would always contain the 64 symbols that indicate what occupies each square. If, for example, there are 20 (for convenience) symbols, each symbol would map to 5% or .05 of the entire range (0 to 1). The assignment might start out as follows: SPACE .0 - .049999... W King .05 - .099999... W Queen .10 - .149999... W Bishop .15 - .199999... W Knight .20 - .249999... W Rook .25 - .299999... ... Black's Turn .90 - .949999... End of Data .95 - .999999... (.9999... Is 'identical' to 1) Note that we can add 0's after the low range values and still maintain accurate precision. Likewise for the upper range values, we can add 9's on to the end of those numbers and get even more precise results. This will later allow us to add more precision, as more and more symbols are encoded. After encoding the first symbol (W Rook for example) the range is narrowed to whatever size W Rook's range is and the 20 symbols are remapped into the new range: W Rook's range was .25 to .299999..., now all the symbols must fit within that range and are remapped giving each 5% of the new range: SPACE .2500 - .252499... W King .2525 - .254999... W Queen .2550 - .257499... W Bishop .2575 - .259999... W Knight .2600 - .262499... If the next symbol is the W Knight, the new range would be from .26 to .262499. and all the symbols would be remapped to the new range. If for example, this was the last (64th) symbol, the output is the lower value, IE: .26 Decode: Starting again with the original symbol ranges, we can determine what the first symbol is by finding out what range the coded number falls into: .26 falls between .25 and .29999... which is the range assigned to W Rook. If we subtract out W Rook's lower range value, (.26 - .25 = .01) and divide by the range size (.05) we get a new number (.01 / .05 = .2) which is again matched against the symbols' ranges. .2 belongs to W Knight, which was the second symbol encoded. Subtracting out W Knight's .2 leaves us with 0, so either a SPACE is the next symbol, or we're done with decoding. Using an End Of Data symbol allows us to be certain where the end is. Another method could rely on there being exactly 64 symbols.... Comments: If, after encoding all 64 symbols, the resulting number could be accurately represented in a (VB) Double type, the actual bit usage could be reduced to the 64 bits needed to store the Double type. In acuality, the 64 bits would be read as one 64 bit binary fraction, where the decimal point is assumed to be on the left side instead of the right, allowing the 64 bits to represent the encoded (fractional) value. The trouble with implementing this method is that the floating point math packages in computers today are not equipped to handle numbers with many significant digits. Some means of working on lengthy numbers is needed and improvements may include using Byte arrays to store each range value. For the compression buffs, this method offers better entropy than Huffman coding in that for a symbol that is repeated often, its 'optimal' bit size would be somewhat less than one. Huffman coding would probably assign one bit for the symbol, which could be 3 or 4 times more than the optimal value. The Arithmetic coding method gets closer to achieving the optimal bit sizes and thus better entropy. How you figure 'optimal' size and other such intracacies are beyond my experience, but you can read about them in 'The Data Compression Book' by Mark Nelson / M & T Books, (algorithms in ANSI C) from which the following code was adapted. Source Code: The demo below uses a form of arithmetic coding using floating point math on a Double Precision value. In its current state, it is only good for about 14 or 15 symbols, and sometimes less. On a new form, add a command button and 3 text boxes. Using the letters A-S (T is the End of Data symbol) enter a string into Text1 and click on the button.... Option Explicit Dim Model(0 To 20) As Double Private Sub Command1_Click() Dim Code As Double Code = Encode(CStr(Text1) & "T") 'Appending EOD Text2 = CStr(Code) Text3 = Decode(Code) End Sub Private Sub Form_Load() Dim i&, k# 'Store Ranges for A-S (and EOD) For i = 0 To 20 Model(i) = k k = k + 0.05 ' = 1/20 of entire (20 symbol) table Next i End Sub Private Function Encode(DataIn$) As Double Dim Hi As Double, Lo As Double, Range As Double Dim i As Byte, sym As Byte Hi = 1# Lo = 0 DataIn = UCase$(DataIn) For i = 1 To Len(DataIn) sym = Asc(Mid$(DataIn, i, 1)) - 64 'Trap input errors If BadInput(sym, i) Then Encode = 0.99999 Exit Function End If 'Symbol OK - Reset new range values Range = Hi - Lo Hi = Lo + (Range * Model(sym)) Lo = Lo + (Range * Model(sym - 1)) Next i Encode = Lo End Function Private Function Decode(Data As Double) As String Dim i As Byte, sym As Byte Do sym = FindSymbol(Data) If sym = 20 Then Exit Do 'End Of Data symbol 'Subtract out the Lo value and divide by the range size Data = (Data - Model(sym - 1)) / 0.05 Decode = Decode & Chr(sym + 64) Loop End Function Private Function FindSymbol(ByVal X As Double) As Byte 'Where does X fall on the range of 0 - .9999 ... FindSymbol = 1 Do While Model(FindSymbol) < X FindSymbol = FindSymbol + 1 Loop End Function Private Function BadInput(ByVal s, ByVal i) As Boolean If s > 20 Then MsgBox "Input error" 'Highlight offensive input! With Text1 .SelStart = i - 1 .SelLength = 1 .SetFocus End With BadInput = True End If End Function
This page took 0.02 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.