EddyHawk's Info List
---
Compression Algorithm:
----
Lossy
---
 JPEG
---
Lossless
---
RLE (Run Length Encoding)
Dictionary
 LZ family
 -the algo is called LZ (Lempel Ziv)
 -LZ77
  invented by Jakob Ziv, proposed by Abraham Lempel & Jakob Ziv on 1977, thus it's called LZ77
  unpatented
  .LZH (Lempel Ziv Hash)
   combination of LZ77 & hash variation
   a lot of patent
   -deflate (ZIP V2.0)
    by: Phil Kotz (PKWARE)
   -ARJ
    by: Robert K. Jung (ARJSoft)
    independently reinvented by Ross Williams as LZRW3-A
  .LZSS (Lempel Ziv Storer Szymanski)
   LZ77 modified by James Storer & Thomas Syzmanski on 1982
   LZSS plus other algos:
   -LZB (LZSS+Bell)
    plus simple coding for variable length
   -LZSS+Shannon-Fano
    used in ZIP V1.x
   -LZH (LZSS+Huffman)
    .LHA (LZ Huffman Archiver)
     by Haruyazu Yoshisaki (Japan)
    .AR002
     by Haruhiko Okumura (Japan)
    .UC
     by AIP-NL
    .early version of PKZIP & ARJ (?)
   -LZAri (LZSS+Arithmetic)
    by Haruhiko Okumura (Japan)
  .LZS (Lempel Ziv Stac)
   patented by Waterworth, owned by Stac, Inc. for disk presor prog called Stacker
   now is known as LZRW-1 (reinvented by Ross Williams) on 1991
   also reinvented by Gibson & Graybill
  .LZRW (Lempel Ziv Rose Williams)
   -by Rose Williams (Australia), the founder of comp.compression newsgroup
   -LZRW-2, LZRW-4
  .LZP (Lempel Ziv Predictor)
   by: Charles Bloom
   EdH: I never seen a successful implementation of LZP. No executable provided
    by the author and X1 V0.95 am5 (LZP algo) hangs my cpu. Why?
 -LZ78
  invented by Jakob Ziv, proposed by Abraham Lempel & Jakob Ziv on 1978, thus it's called LZ78
  .LZW (Lempel Ziv Welch)
   developed by David Welch on 1983 and proposed on 1984
   patented by IBM & UniSys
   used in GIF picture format
   -Dynamic LZW
    1 of algorithms used by PKZIP V2.04g
   -LZC
    algo used by Unix's Compress
    newer method uses LRU
  .LZMW (Lempel Ziv Miller Wegman)
   invented by Miller & Wagner (1984)
  .LZAP (Lempel Ziv All Prefixes)
   invented & patented by James Storer (1988)
   reinvented by Dan Bernstein (1990)
  .Y (Yabba or Yabbawhap)
   by D. J. Bernstein (1990)
   reinvented by Rose Williams as LZRW-5
 -LZFG (Lempel Ziv Fiala Greene)
  proposed by Fiala & Greene (1990)
  combination of LZ77 & LZ78
  also called 'LZ with Finite Windows' (?)
  patented (?)
 -lazy match
 -LZO (Lempel Ziv Oberhumer)
  by: Markus F.X.J. Oberhumer (Austria)
Statistical
 -Huffman
  .invented by D.A. Huffman (1952)
  .static Huffman
   2 pass
    read data to collect frequency distribution then reread data to
     compress
  .dynamic/adaptive Huffman
 -Shannon-Fano
 -Arithmetic
  proposed by P. Elias
  eats a lot of mem & cpu power -> not practical
  .DMC (Dynamic Markov Coding)
   unlimited mem & cpu power requirement, rarely used
   -Guazzo's
  .PPM (Prediction by Partial Matching)
   proposed by Cleary, Witten & W. J. Teahan
   -PPMC
    implementation by Allistair Moffat
   -PPM*
    Cleary, Witten & W. J. Teahan's improvement
   -PPMD
    by Bill Teahan
   -PPMZ
    Charles Bloom's improvement
 -Splay tree
  mediocre pres
  usable as cipher
Associative Coding
 by George Buyanovsky
 used in ACB archiver
Magic Function Theory
 compression functions proposed (mostly repetitively applied) which on 1st thought looked capable of incredible compression ratio & stay lossless at the same time, but actually is impossible
---
Burrows-Wheeler Transformation (BWT)
 -proposed by Michael Burrows & David Wheeler (1994)
 -invented by David Wheeler (1983)
 -not compression algorithm, but reversibly transform data to format suitable
  for compression
 -BWT is proposed to be followed by MTF (Move To Front) coding
 .BWT plus
  BWT+Shannon-Fano
  BWT+6/2 arith
  FBS (Fenwick's Block Sorting Transformation)
   by: Peter Fenwick
TAR
 used in Unix, combine multiple files into single file before compress
 TAR doesn't provide compression
 compressed data must be processed twice: first decompress it
  then unTAR to extract files in TAR-ed archive
Solid Archiving
 sees multiple (small/similar) files as single file before compress
 compressed data only be processed once, because the solid archiver 
 simultaneously decompress & extract files
---
MISC
---
-Too many compression algorithms are patented and often overlaps each others
 or comprise too wide algorithms
 .it try to patent mathematic algorithms while most court deals with source
  piracy
  it can't be settled properly in court
 .compressor author find difficulties to properly select & implement
  compression algorithms w/o getting sued
 .getting harder to invent & patent new algorithm
 .patent office doesn't have enough knowledge to examine patent requests
 .application author find difficulties on supporting standard, well-known
  but patented format
  ex: to use GIF you must pay to at least 30 parties!
-Peter Fenwick (New Zealand)
 write paper about 'Improved BWT Algorithm'
 write paper about 'Associative Coder' (?)