Computer Chess Club Archives


Subject: Re: run-length-encoding question (slightly off-topic)

Author: martin fierz

Date: 20:39:35 03/27/02

On March 27, 2002 at 21:45:42, Eugene Nalimov wrote:

>Under x86 Windows you can get 3Gb of address space, not 2Gb.

i thought you get 2GB, and if you set some kind of switch in an obscure file you
can get 3GB? and i also thought that was only possible under win2000? or am i
mixing things up?
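
if it's the /3GB thing i'm half-remembering, it would look something like this
(unverified on my side, and the boot entry below is just an example):

  [boot.ini -- append /3GB to the OS boot entry; only the NT 4.0
   Enterprise / win2000 server editions honor it]
  multi(0)disk(0)rdisk(0)partition(1)\WINNT="Windows 2000 Server" /3GB

  rem the exe must also be marked large-address aware when linking:
  link /LARGEADDRESSAWARE /OUT:engine.exe engine.obj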

>And out of curiosity: had you tried to compress with DATACOMP.EXE?

no. the problem is that i don't compress the original file, so i can't compare
with datacomp. schaeffer introduced this trick to let the values of positions
with captures for either side "float" to the "dominant" value in a database,
i.e. to the value which occurs most often. this works in checkers, because
captures are mandatory, and therefore, when you access your db and you see a
capture position, you don't do a lookup but just finish a quiescence search. it
can't explode, because you have to capture when you can.
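in code, the floating pass is roughly this (a simplified sketch, not the actual
program: one byte per position, fixed blocks, and has_capture() is a made-up
helper):

  /* let capture positions "float": their stored value is never looked
     up (quiescence resolves them), so set it to the most frequent
     value among the forced positions, giving RLE longer runs. */

  #define BLOCKSIZE 1024

  extern int has_capture(unsigned long index);  /* hypothetical */

  void float_values(unsigned char *block, unsigned long first)
  {
      int count[256] = {0};
      int i, dominant = 0;

      /* find the dominant value among the non-capture positions */
      for (i = 0; i < BLOCKSIZE; i++)
          if (!has_capture(first + i))
              count[block[i]]++;
      for (i = 1; i < 256; i++)
          if (count[i] > count[dominant])
              dominant = i;

      /* capture positions take the dominant value */
      for (i = 0; i < BLOCKSIZE; i++)
          if (has_capture(first + i))
              block[i] = (unsigned char)dominant;
  }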
my compression program takes the original file, first floats the values, and
then compresses. it doesn't produce any intermediate output which i could run
datacomp on. if you are really interested, i can produce such a file and do the
experiment.
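
the RLE step itself is just the textbook byte-oriented kind; a minimal sketch
of the encoder (real db formats pack runs tighter than plain (length, value)
byte pairs):

  #include <stddef.h>

  /* encode each run as a (length, value) byte pair, runs capped at
     255 bytes; returns the compressed size in bytes. */
  size_t rle_compress(const unsigned char *in, size_t n,
                      unsigned char *out)
  {
      size_t i = 0, o = 0;
      while (i < n) {
          unsigned char v = in[i];
          size_t run = 1;
          while (i + run < n && in[i + run] == v && run < 255)
              run++;
          out[o++] = (unsigned char)run;  /* run length */
          out[o++] = v;                   /* run value  */
          i += run;
      }
      return o;
  }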

aloha
  martin

>Eugene




>On March 27, 2002 at 20:27:10, martin fierz wrote:
>
>>On March 27, 2002 at 17:05:09, Dan Andersson wrote:
>>
>>>It might be so. But Huffman decoding can be sped up quite a bit by using a
>>>tad more memory for a decoding cache and lookup table. As long as they fit in
>>>the CPU cache it might be possible. Whether it's feasible or not depends on
>>>A) the different compression ratios and B) the penalty for memory accesses,
>>>where A will depend on the nature of the data, and B will depend heavily on
>>>A. Modern CPUs have far outstripped the ability of main memory to supply them
>>>with data, and HDs are dog slow. Therefore what was true a few years ago must
>>>be re-examined. A hybrid scheme might be envisioned (all depending on the
>>>data compressed), where the HD files might use a heavier compression. (There
>>>are systems that manage to compile source code, or intermediate
>>>representations, and start the programs faster than they could load a
>>>precompiled program, due to the greater size of the executable.)
>>>The gain in loading time could then be used to decompress to RLE or even
>>>Huffman code in memory, and the resulting memory block might be further
>>>decoded in the cache, thanks to the smaller amount of memory transferred.
>>>Such a multi-tiered approach might be possible, and maybe even able to supply
>>>the CPU with the data it needs. But the tuning and scheduling of it might be
>>>a nightmare. Or a pure joy. This is all fever-driven speculation from me, and
>>>I know just enough compression theory to be dangerous :) But the main point
>>>is that algorithmic approaches that were unfeasible a few years ago might fit
>>>better in today's or tomorrow's architectures. Boy, this is interesting!
>>>
>>>MvH Dan Andersson
>>
>>you definitely have a point here. not only because of everything you say, but
>>also because the 8-piece database, compressed by the chinook team, is something
>>like 5GB in size. i haven't quite finished the 8-piece database, but my 6-piece
>>db is about 10% smaller than theirs. i also have a small improvement to my
>>algorithm which could give some more compression. anyway, the point is of course
>>that even if it's "only" 4GB in the end, you can never load it into memory, and
>>always hit the hard disk. under windows, you can get 2GB, but no more, so if you
>>could fit the whole thing in 2GB, you might be able to access it much faster
>>even though decompression takes much longer CPU-wise - as long as you don't hit
>>the disk it wouldn't really matter.
>>unfortunately, i'm preparing for a world championship challenger tournament, and
>>don't have too much time to look deeper into this :-( and will stick with the
>>simple RLE. i also just saw that the chinook guys are publishing their 8-piece
>>db finally, so maybe i'll just throw it all away...
>>
>>aloha
>>  martin
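
p.s. on dan's table-driven huffman idea above: the usual trick is to peek a
fixed number of bits and look them up in a table that gives the symbol plus its
true code length. a sketch, assuming no code is longer than 8 bits and a
prebuilt 256-entry table (construction omitted, and it may read a byte past the
input at the tail -- fine for a sketch):

  #include <stddef.h>

  struct entry {
      unsigned char symbol;  /* decoded symbol              */
      unsigned char length;  /* bits its code actually uses */
  };

  extern struct entry table[256];  /* hypothetical, prebuilt */

  void huff_decode(const unsigned char *in, unsigned char *out,
                   size_t nsymbols)
  {
      unsigned int bits = 0;  /* bit buffer (low bits valid) */
      int nbits = 0;          /* number of valid bits        */
      size_t s;

      for (s = 0; s < nsymbols; s++) {
          while (nbits < 8) {             /* refill to peek 8 bits */
              bits = (bits << 8) | *in++;
              nbits += 8;
          }
          unsigned int peek = (bits >> (nbits - 8)) & 0xff;
          out[s] = table[peek].symbol;
          nbits -= table[peek].length;    /* consume code bits only */
          bits &= (1u << nbits) - 1;      /* drop consumed bits     */
      }
  }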


