Publication | Closed Access
CORE: Cross-object redundancy for efficient data repair in storage systems
Citations: 27
References: 32
Year: 2013
Venue: Unknown
Distributed File System, Cluster Computing, Storage Performance, Engineering, Storage Management, Computer Architecture, Data Deduplication, Storage Structure, High Fault-tolerance, Cross-object Redundancy, Storage Systems, Data Science, Data Integration, Parallel Computing, Erasure Codes, Data Management, Traditional Erasure Codes, Computer Engineering, Computer Science, Storage Virtualization, Edge Computing, Cloud Computing, Parallel Programming, Distributed Data Store, In-storage Computing
Erasure codes are an integral part of many distributed storage systems aimed at Big Data, since they provide high fault tolerance at low overhead. However, traditional erasure codes are inefficient at replenishing lost data (vital for long-term resilience) and at reading stored data in degraded environments (when nodes might be unavailable). Consequently, novel codes optimized for the nuances of distributed storage systems are being actively researched. In this paper, we take an engineering alternative and explore the use of simple and mature techniques: we juxtapose a standard erasure code with RAID-4-like parity to realize cross-object redundancy (CORE), and integrate it with HDFS. We benchmark the implementation on a proprietary cluster and on EC2. Our experiments show that, for an extra 20% storage overhead (compared to traditional erasure codes), CORE yields up to 58% savings in bandwidth and is up to 76% faster when recovering a single failed node. The gains are 16% and 64%, respectively, for double node failures.
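The RAID-4-like parity described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes each object has already been split into equally sized blocks by some standard erasure code (not modeled here), and that the cross-object parity is a plain bytewise XOR over blocks at the same position in each object. The function names `xor_blocks`, `cross_object_parity`, and `repair_block` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def cross_object_parity(objects):
    """Build one parity object over a group of objects.

    `objects` is a list of objects, each a list of equally sized blocks
    (assumed to be the output of a standard erasure code, not shown here).
    Parity block j is the XOR of block j of every object in the group.
    """
    return [xor_blocks(list(column)) for column in zip(*objects)]


def repair_block(objects, parity, obj_idx, blk_idx):
    """Rebuild block `blk_idx` of object `obj_idx` from the other objects.

    Only the blocks at the same position in the surviving objects and the
    matching parity block are read; the lost object's own blocks are unused.
    """
    survivors = [obj[blk_idx] for i, obj in enumerate(objects) if i != obj_idx]
    return xor_blocks(survivors + [parity[blk_idx]])


# Three toy objects, each made of four 4-byte blocks (illustrative data only).
objects = [
    [bytes([o * 16 + b] * 4) for b in range(4)]
    for o in range(3)
]
parity = cross_object_parity(objects)

# Pretend block 2 of object 1 was lost and rebuild it from the group.
rebuilt = repair_block(objects, parity, obj_idx=1, blk_idx=2)
print(rebuilt == objects[1][2])
```

In this sketch, repairing one block reads one block from each other object in the group plus one parity block, whereas repairing within a single erasure-coded object would typically read as many blocks as the code needs to decode. How that translates into the bandwidth savings reported above depends on the code parameters and group size, which this example does not model.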