Publication | Closed Access
DZip: improved general-purpose lossless compression based on novel neural network modeling
Citations: 39
References: 26
Year: 2021
Venue: 2021 Data Compression Conference (DCC)
Lossy Compression · Engineering · Machine Learning · Data Science · Model Compression · Pattern Recognition · Sparse Neural Network · Computer Engineering · Novel Neural Network · Specialized Compressors · Computer Science · Deep Learning · Data Compression · Lossless Compression · Arithmetic Coding
We consider lossless compression based on statistical data modeling followed by prediction-based encoding, where an accurate statistical model for the input data leads to substantial improvements in compression. We propose DZip, a general-purpose compressor for sequential data that exploits the well-known modeling capabilities of neural networks (NNs) for prediction, followed by arithmetic coding. DZip uses a novel hybrid architecture based on adaptive and semi-adaptive training. Unlike most NN-based compressors, DZip does not require additional training data and is not restricted to specific data types. The proposed compressor outperforms general-purpose compressors such as Gzip (29% size reduction on average) and 7zip (12% size reduction on average) on a variety of real datasets, achieves near-optimal compression on synthetic datasets, and performs close to specialized compressors for large sequence lengths, without any human input. While the main limitation of NN-based compressors is generally the encoding/decoding speed, we empirically demonstrate that DZip achieves a compression ratio comparable to that of other NN-based compressors while being several times faster. The source code for DZip and links to the datasets are available at https://github.com/mohit1997/Dzip-torch.
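The pipeline the abstract describes, a statistical model feeding symbol probabilities to an arithmetic coder, can be illustrated with a minimal sketch. The adaptive count model below is a hypothetical stand-in for DZip's neural predictor, and exact rationals replace the fixed-precision renormalizing coder a practical compressor would use; the function names are illustrative, not DZip's API.

```python
from fractions import Fraction

def encode(seq, alphabet):
    """Arithmetic-encode seq: shrink [low, low + width) to the
    subinterval of each symbol under the model's probabilities."""
    counts = dict.fromkeys(alphabet, 1)   # adaptive count model (stand-in for the NN)
    low, width = Fraction(0), Fraction(1)
    for sym in seq:
        total = sum(counts.values())
        cum = Fraction(0)
        for s in alphabet:                # locate sym's probability subinterval
            p = Fraction(counts[s], total)
            if s == sym:
                low += width * cum
                width *= p
                break
            cum += p
        counts[sym] += 1                  # update the model after coding, as the decoder will
    return low                            # any point in the final interval identifies seq

def decode(code, n, alphabet):
    """Invert encode() by tracking the same intervals and the same model."""
    counts = dict.fromkeys(alphabet, 1)
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(n):
        total = sum(counts.values())
        cum = Fraction(0)
        for s in alphabet:                # find the subinterval containing code
            p = Fraction(counts[s], total)
            if code < low + width * (cum + p):
                out.append(s)
                low += width * cum
                width *= p
                counts[s] += 1
                break
            cum += p
    return out

if __name__ == "__main__":
    msg = list("abracadabra")
    code = encode(msg, alphabet=sorted(set(msg)))
    assert decode(code, len(msg), alphabet=sorted(set(msg))) == msg
```

A better predictor shrinks the interval more slowly, so fewer bits suffice to pin down a point inside it; this is the sense in which more accurate NN modeling directly improves compression.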