TLDR

Performance isolation and differentiation among workloads sharing a storage infrastructure is essential, yet current tools rely on detailed provisioning that is slow and impractical for dynamic systems. The authors propose a software‑only solution that guarantees predictable storage access performance without requiring detailed system knowledge. The solution employs an online feedback loop with an adaptive, distributed controller that throttles requests to share throughput among workloads according to performance goals and importance, treating the system as a black box. Evaluation of the prototype, Triage, shows effective workload isolation and differentiation in an overloaded cluster file‑system with changing workloads and components.

Abstract

Ensuring performance isolation and differentiation among workloads that share a storage infrastructure is a basic requirement in consolidated data centers. Existing management tools rely on resource provisioning to meet performance goals; they require detailed knowledge of the system characteristics and the workloads. Provisioning is inherently slow to react to system and workload dynamics and, in the general case, it is not practical to provision for the worst case. We propose a software-only solution that ensures predictable performance for storage access. It is applicable to a wide range of storage systems and makes no assumptions about workload characteristics. We use an online feedback loop with an adaptive controller that throttles storage access requests to ensure that the available system throughput is shared among workloads according to their performance goals and their relative importance. The controller considers the system as a “black box” and adapts automatically to system and workload changes. The controller is distributed to ensure high availability under overload conditions, and it can be used for both block and file access protocols. The evaluation of Triage, our experimental prototype, demonstrates workload isolation and differentiation in an overloaded cluster file-system where workloads and system components are changing.
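
To make the idea of throttling by feedback concrete, the following is a minimal sketch of one control interval. It is not Triage's controller: the `Workload` fields, the `adjust_caps` function, the multiplicative `step`, and the rule of throttling only less important workloads are all assumptions chosen for illustration. The sketch treats the storage system as a black box and reads only the measured latency of each workload.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    latency_goal_ms: float  # performance goal (hypothetical field)
    priority: int           # lower number = more important (assumed convention)
    cap_iops: float         # current throttle: max admitted requests per second


def adjust_caps(workloads, measured_latency_ms, step=0.1, min_cap=10.0):
    """Run one control interval using only observed latencies.

    If every workload meets its goal, raise all caps to probe for spare
    throughput. Otherwise, lower the caps of workloads that are less
    important than the most important workload missing its goal.
    """
    violated = [
        w for w in workloads
        if measured_latency_ms[w.name] > w.latency_goal_ms
    ]
    if not violated:
        for w in workloads:
            w.cap_iops *= 1 + step
        return

    top = min(w.priority for w in violated)
    for w in workloads:
        if w.priority > top:
            # Never throttle below a small floor, so no workload starves.
            w.cap_iops = max(min_cap, w.cap_iops * (1 - step))


workloads = [
    Workload("oltp", latency_goal_ms=10.0, priority=0, cap_iops=1000.0),
    Workload("backup", latency_goal_ms=50.0, priority=1, cap_iops=1000.0),
]
# "oltp" misses its goal, so only the less important "backup" is throttled.
adjust_caps(workloads, {"oltp": 25.0, "backup": 20.0})
```

Calling `adjust_caps` once per measurement interval closes the loop: caps shrink while an important workload misses its goal and grow again once all goals are met. A real controller would also need to handle the case where the least important workload is the only one in violation, which this sketch leaves unchanged.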
