Concepedia

TLDR

Availability is a highly desired but minimally engineered property of storage systems. Mechanisms such as redundancy and failure recovery exist, but configuring them is typically left to system managers, who often lack the skills and the time to balance the trade-offs involved. The result is static, suboptimal configurations, a problem that becomes critical in distributed, wide-area peer-to-peer storage infrastructures. This paper presents TotalRecall, a peer-to-peer storage system that automates availability management. TotalRecall automatically measures and estimates the availability of its constituent hosts, predicts their future availability from past behavior, and calculates appropriate redundancy mechanisms and repair policies. By doing so, it delivers user-specified availability while maximizing efficiency.

Abstract

Availability is a storage system property that is both highly desired and yet minimally engineered. While many systems provide mechanisms to improve availability - such as redundancy and failure recovery - how to best configure these mechanisms is typically left to the system manager. Unfortunately, few individuals have the skills to properly manage the trade-offs involved, let alone the time to adapt these decisions to changing conditions. Instead, most systems are configured statically and with only a cursory understanding of how the configuration will impact overall performance or availability. While this issue can be problematic even for individual storage arrays, it becomes increasingly important as systems are distributed - and absolutely critical for the wide-area peer-to-peer storage infrastructures being explored. This paper describes the motivation, architecture and implementation for a new peer-to-peer storage system, called TotalRecall, that automates the task of availability management. In particular, the TotalRecall system automatically measures and estimates the availability of its constituent host components, predicts their future availability based on past behavior, calculates the appropriate redundancy mechanisms and repair policies, and delivers user-specified availability while maximizing efficiency.
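
The measure-then-provision loop described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `estimate_host_availability` and `replicas_needed` are hypothetical, redundancy is assumed to be plain whole-file replication, and host failures are assumed to be independent, so that k replicas on hosts with availability p give a file availability of 1 - (1 - p)^k.

```python
def estimate_host_availability(observations):
    """Estimate availability as the fraction of probes that found the host up.

    `observations` is a sequence of booleans (True = host reachable).
    Hypothetical helper; a real system would weight recent history.
    """
    if not observations:
        raise ValueError("need at least one observation")
    return sum(1 for up in observations if up) / len(observations)


def replicas_needed(host_availability, target_availability):
    """Smallest replica count k with 1 - (1 - p)**k >= target.

    Assumes independent host failures and whole-file replication
    (an illustrative assumption, not the paper's stated policy).
    """
    if not 0.0 < host_availability <= 1.0:
        raise ValueError("host_availability must be in (0, 1]")
    if not 0.0 <= target_availability < 1.0:
        raise ValueError("target_availability must be in [0, 1)")
    k = 1
    # Add replicas until the chance that all are down is small enough.
    while 1.0 - (1.0 - host_availability) ** k < target_availability:
        k += 1
    return k


if __name__ == "__main__":
    probes = [True, False, True, False]          # host up in half the probes
    p = estimate_host_availability(probes)
    print(p, replicas_needed(p, 0.999))
```

Under these assumptions, less available hosts require more replicas for the same user-specified target, which is the trade-off between availability and storage efficiency that the system is meant to manage automatically.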
