Computer Chess Club Archives

Search

Terms

Messages

Subject: Re: Number of positions in chess -- the rest of the story

Author: Ratko V Tomic

Date: 20:32:53 01/03/06

> Now, back to chess, could your idea be used to create
> extremely well compressed tablebase files?
>
> Or how about a stupendously well squished chess database?

I think it would superior for both the types of coding tasks to
Huffman or Arithmetic code. In addition to speed & compactness
edge it has, there is an additional advantage for the tablebase
or general data base records described in an earlier post here:

 http://www.talkchess.com/forums/1/message.html?476561

It is that the size of output is fixed for any given "enumerative
class" (such as all permutations with given number of elements,
or all trees with given number of leaf & internal nodes). That
output size is also accessible without decoding the data (the
Arithmetic & Huffman lack both of these properties). So one
can have many small compressed chunks, all concatenated (or
layed out in some structure or fields) in one larger block
and accessible randomly, without having to decode all the
stuff up to the target chunk.

Another advantage is in the "universal" or "descriptive" style
of modeling as opposed to "adaptive" or "predictive" style,
which is the current AC modeling "paradigm". This is covered
in detail in the report [T2]:

   "Quantized indexing: Background information"
   http://www.1stworks.com/ref/TR/tr05-0625a.pdf

last chapter (pp. 26-36). The effect of this difference is
illustrated in the performance table ([T3] p. 10), in the
last row called "Vary" where you can see that QI compresses
the input "Vary" to less than half of AC. The input "Vary"
can be anything which changes the symbol statistics
drastically. For conciseness of description the [T3] uses
simple int32 array { ...,-2,-1,0,+1,+2,....}, but anything
with that type of drastic change results in similar compression
difference. Essentially, the "descriptive" modeling used by QI
is never "suprised" very much by the input (since it doesn't
try to predict it but only to describe it), while AC's
"predictive" approach gets thrown off by the unpredictable.
For example, when coding using blocks of 1024 bits/block,
one can encode the count of 1's (or less frequent symbol)
in about 10 bits or in about 1/2 of that if probability
is known (estimated from previous blocks). With that count
the block with given number of ones is enumerated optimally.
The effects of "surprise" in sudden change of count of 1's
affect maninly the encoding of the count component, while
the generally larger part of output, the enumerative index,
is still coded optimally.

This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.