Concepedia

Publication | Closed Access

Glacier: highly durable, decentralized storage despite massive correlated failures

Citations: 261

References: 34

Year: 2005

TLDR

Decentralized storage aggregates disk space across many computers and relies on redundancy for durability. Existing systems either assume independent node failures or rely on introspection to place redundant data on nodes with low expected failure correlation; in practice, failures are correlated, and Internet worms can trigger them at large scale. The paper describes Glacier, a distributed storage system that uses massive redundancy to keep data durable despite large-scale correlated failures. Glacier reduces storage cost with erasure coding and garbage collection, and reduces messaging cost by aggregating small objects and using a loosely coupled maintenance protocol for redundant fragments. In one configuration, Glacier provides six-nines durability despite correlated failures of up to 60% of the storage nodes, at the cost of an elevenfold storage overhead and an average of 4 messages per node per minute during normal operation. It serves as the storage layer for an experimental serverless email system.

Abstract

Decentralized storage systems aggregate the available disk space of participating computers to provide a large storage facility. These systems rely on data redundancy to ensure durable storage despite node failures. However, existing systems either assume independent node failures, or they rely on introspection to carefully place redundant data on nodes with low expected failure correlation. Unfortunately, node failures are not independent in practice, and constructing an accurate failure model is difficult in large-scale systems. At the same time, malicious worms that propagate through the Internet pose a real threat of large-scale correlated failures. Such rare but potentially catastrophic failures must be considered when attempting to provide highly durable storage.

In this paper, we describe Glacier, a distributed storage system that relies on massive redundancy to mask the effect of large-scale correlated failures. Glacier is designed to aggressively minimize the cost of this redundancy in space and time: erasure coding and garbage collection reduce the storage cost; aggregation of small objects and a loosely coupled maintenance protocol for redundant fragments minimize the messaging cost. In one configuration, for instance, our system can provide six-nines durable storage despite correlated failures of up to 60% of the storage nodes, at the cost of an elevenfold storage overhead and an average messaging overhead of only 4 messages per node per minute during normal operation. Glacier is used as the storage layer for an experimental serverless email system.
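
The trade-off between durability and storage overhead under erasure coding can be illustrated with a short calculation. The sketch below is not taken from the abstract. It assumes a simplified model in which an object is encoded into n fragments, any r of which suffice to recover it, and each fragment's node fails independently with probability f during a correlated failure event. The names `loss_probability` and `min_fragments`, and the choice r = 5, are hypothetical and used only for illustration.

```python
from math import comb


def loss_probability(n: int, r: int, f: float) -> float:
    """Probability that fewer than r of n fragments survive.

    Simplifying assumption (not from the abstract): each fragment is on a
    distinct node, and each node fails independently with probability f.
    """
    p = 1.0 - f  # survival probability of a single fragment
    # Sum the binomial probabilities of exactly k survivors, for k < r.
    return sum(comb(n, k) * p**k * f**(n - k) for k in range(r))


def min_fragments(r: int, f: float, target_loss: float, n_max: int = 1000):
    """Smallest n whose loss probability is at most target_loss, else None."""
    for n in range(r, n_max + 1):
        if loss_probability(n, r, f) <= target_loss:
            return n
    return None


if __name__ == "__main__":
    r = 5             # hypothetical: fragments needed to reconstruct an object
    f = 0.6           # fraction of nodes lost, as in the quoted configuration
    target = 1e-6     # "six nines" durability means loss probability <= 1e-6
    n = min_fragments(r, f, target)
    print(f"n = {n}, raw storage overhead n/r = {n / r:.1f}")
    print(f"loss probability = {loss_probability(n, r, f):.3e}")
```

Under this model the raw overhead is simply n/r. It need not match the elevenfold figure quoted above, which this sketch does not attempt to reproduce.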
