Publication | Closed Access
Probabilistic deduplication for cluster-based storage systems
Citations: 37
References: 19
Year: 2012
Unknown Venue
Cluster Computing, Engineering, Storage Management, Data Deduplication, Storage Structure, Distributed Deduplication Techniques, Storage Systems, Data Science, Data Mining, Stateless Strategies, Data Integration, Parallel Computing, Big Data, Data Management, Probabilistic Deduplication, Computer Engineering, Computer Science, Cloud Computing, Parallel Programming, Distributed Data Store, Single-node Backup Systems
The need to back up huge quantities of data has led to the development of a number of distributed deduplication techniques that aim to reproduce the operation of centralized, single-node backup systems in a cluster-based environment. At one extreme, stateful solutions rely on indexing mechanisms to maximize deduplication. However, the cost of these strategies in terms of computation and memory resources makes them unsuitable for large-scale storage systems. At the other extreme, stateless strategies store data blocks based only on their content, without taking into account previous placement decisions, thus reducing the cost but also the effectiveness of deduplication.
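To illustrate the stateless extreme, the following is a minimal sketch of content-based placement, not the method proposed in the paper. The names `route_stateless` and `Cluster`, the choice of SHA-1, and the modulo mapping are all assumptions made for this example. It routes single blocks for simplicity and so does not model the loss in deduplication effectiveness the abstract mentions.

```python
import hashlib


def route_stateless(block: bytes, num_nodes: int) -> int:
    """Pick a node from the block's content alone (hypothetical helper).

    No record of earlier placement decisions is consulted, which is what
    makes the strategy stateless.
    """
    digest = hashlib.sha1(block).digest()
    # Interpret the first 8 bytes of the fingerprint as an integer.
    return int.from_bytes(digest[:8], "big") % num_nodes


class Cluster:
    """Toy cluster: each node is a dict mapping fingerprint -> block."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes = [dict() for _ in range(num_nodes)]

    def store(self, block: bytes) -> bool:
        """Store a block; return True if it was new on its target node."""
        node = self.nodes[route_stateless(block, len(self.nodes))]
        fingerprint = hashlib.sha1(block).hexdigest()
        if fingerprint in node:
            return False  # duplicate detected by the local, per-node index
        node[fingerprint] = block
        return True


cluster = Cluster(num_nodes=4)
first = cluster.store(b"backup block A")   # new block
second = cluster.store(b"backup block A")  # same content, same node
```

Each node only needs a local index here, since the target is computed from the content and no cluster-wide lookup takes place. A stateful design would instead consult an index of earlier placements before choosing a node.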