Publication | Closed Access
Engineering the compression of massive tables: an experimental approach
Citations: 59
References: 16
Year: 2000
Geometry Compression, Massive Tables, Engineering, Data Optimization, Data Science, Data Mining, Very Large Database, Knowledge Discovery, Computer-aided Design, Computer Science, Discrete Mathematics, Lossless Compression, Data Compression, Computational Geometry, Data Management, Novel Compression, Data-intensive Computing, Compression Size
We study the problem of compressing massive tables. We devise a novel compression paradigm, training for lossless compression, which assumes that the data exhibit dependencies that can be learned by examining a small amount of training material. We develop an experimental methodology to test the approach. Our result is a system, pzip, which outperforms gzip by factors of two in compression size and in both compression and uncompression time for various tabular data. Pzip is now in production use in an AT&T network traffic data warehouse.
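The idea of learning from a small training sample and then compressing the full table losslessly can be illustrated with a toy sketch. This is not pzip's algorithm, which the abstract does not describe. It assumes a hypothetical fixed-width table (`WIDTHS`, `make_rows`) and only two candidate layouts, row-major and column-major, with zlib standing in as the underlying compressor. The `train` step picks whichever layout compresses the sample better, and that choice is then reused for the whole table.

```python
import random
import zlib

WIDTHS = (4, 8, 6)  # hypothetical fixed field widths in bytes (assumption)


def make_rows(n, seed=0):
    """Build a synthetic fixed-width table; the data are invented for illustration."""
    rng = random.Random(seed)
    return [
        rng.choice([b"EAST", b"WEST"]) + b"%08d" % i + b"%06d" % rng.randrange(10**6)
        for i in range(n)
    ]


def serialize(rows, layout):
    """Lay the table out row by row, or field by field for the 'column' layout."""
    if layout == "row":
        return b"".join(rows)
    parts, start = [], 0
    for w in WIDTHS:
        parts.append(b"".join(r[start:start + w] for r in rows))
        start += w
    return b"".join(parts)


def deserialize(data, layout):
    """Invert serialize() so that the scheme stays lossless."""
    row_len = sum(WIDTHS)
    n = len(data) // row_len
    if layout == "row":
        return [data[i * row_len:(i + 1) * row_len] for i in range(n)]
    rows, offset = [b""] * n, 0
    for w in WIDTHS:
        for i in range(n):
            rows[i] += data[offset + i * w:offset + (i + 1) * w]
        offset += n * w
    return rows


def train(sample):
    """Pick the layout that gives the smaller compressed size on the sample."""
    return min(
        ("row", "column"),
        key=lambda layout: len(zlib.compress(serialize(sample, layout))),
    )


def compress(rows, layout):
    return zlib.compress(serialize(rows, layout))


def decompress(blob, layout):
    return deserialize(zlib.decompress(blob), layout)


table = make_rows(5000)
layout = train(table[:200])      # learn from a small training sample
blob = compress(table, layout)   # apply the learned choice to the full table
assert decompress(blob, layout) == table  # round trip is lossless
```

The layout chosen on the sample is only as good as the sample is representative of the full table, which is the assumption the paradigm itself rests on.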