Computer Chess Club Archives


Search

Terms

Messages

Subject: run-length-encoding question (slightly off-topic)

Author: martin fierz

Date: 22:22:43 03/26/02


aloha!

i am building the 8-piece checkers endgame database, and my generator will soon
be finished with it. now i'm working on a compression scheme, similar to what
was used in chinook, but not quite the same. here's approximately what i'm
doing: instead of saving a pair of (value,run-length), i save a pair (value,x),
and the run-length of the value is given by an array skip(x). example: if i
produce an array skip(32)=5,6,7,8,9,10,11,12,...36, i can encode runs of lengths
5 to 36 perfectly. (runs shorther than 5 are useless - i store them literally
instead of encoding them). however, when a long run comes along, let's say, 1000
bits, then i need about 30 bytes to encode that run. so i could for instance use
skip(32)=5,10,15,20,...160. then i would need only about 7 bytes to store a run
of 1000 bits. however, with this scheme, if i have a run of length 6, i have to
store it as 5+literal, so i have a problem there. when i thought about this, i
decided on something like
skip(x)=5,6,7,8,9,10,11,12,14,16,18,20,22,25,30,35,40,45,50,60,70,80,90,100,125,150,200,400,600,900,1200,2000.
i've forgotten the exact values i used, but it's not so important. the point
here is that short runs can be stored in one byte. anyway, after programming
this, the following question seems obvious: is there some kind of magical choice
of values for skip(x) which maximizes my database size for some number-theoretic
reason? also obvious, the very best skip(x) depends on the database (maybe it
has lots of runs with length y, and therefore it's a good idea to use y in the
skip array). of course i could run many different skip arrays and choose the one
which gives the best compression - this is impractical, because compressing the
8-piece db takes about one day, so i can't (or don't want to) test too many
choices, and am looking for a good starting choice.
these magic numbers would somehow have the property that you can express any
number within their range as a sum of very few of these numbers. some choices
which might be interesting are a binary sequence 5,6,7,8,16,32,64,128,256, etc.
or a fibonacci sequence 5,8,13,21,34,55,89 or a sequence where log(x) is evenly
spaced.
i'm sure i'm not the first person to wonder about this. this question might be
answered in knuth or a similar book. does anybody have a good idea about this?

cheers
  martin



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.