Publication | Closed Access
A note on the Ziv - Lempel model for compressing individual sequences (Corresp.)
80
Citations
5
References
1983
Year
EngineeringZiv-lempel Parsing TreeNatural Language ProcessingSyntaxString-searching AlgorithmData ScienceString ProcessingLanguage StudiesCoding TheoryLempel ModelApproximation TheoryLossless CompressionIndividual SequencesVariable-length CodeString MatchingComputer ScienceData CompressionZiv-lempel Compression AlgorithmLinguistics
The Ziv-Lempel compression algorithm is a string matching and parsing approach to data compression. The symbolwise equivalent for parsing models has been defined by Rissanen and Langdon and gives the same ideal codelength at the same cost in coding parameters. By describing the context and coding parameter for each symbol an insight is provided into how the Ziv-Lempel method achieves compression. This treatment does not employ a probabilistic source for the data string. The Ziv-Lempel method effectively counts symbol instances within parsed phrases. The coding parameter for each symbolwise context is determined by cumulative count ratios. The code string length increase for a symbol <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">y</tex> following substring <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">s</tex> , under the symbolwise equivalent, is the log of the ratio of node counts in subtrees <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">s</tex> and <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">s\cdot y</tex> of the Ziv-Lempel parsing tree. To demonstrate the symbolwise equivalent of the Ziv-Lempel algorithm, we extend the work of Rissanen and Langdon to incomplete parse trees. The result requires the proper handling of the comma when one phrase is the prefix of another phrase.
| Year | Citations | |
|---|---|---|
Page 1
Page 1