TLDR

This paper introduces a persistent data management layer that simplifies the construction of cluster-based Internet services. The layer, called a distributed data structure (DDS), presents a conventional single-site interface while partitioning and replicating data across a cluster. The authors implement a distributed hash table DDS that uses two-phase commits to present a coherent view of its data, so that any node can service any task. Services inherit incremental scaling, fault tolerance, high availability, high concurrency, consistency, and durability from the DDS instead of implementing these properties themselves. The hash table has been scaled to a 128-node cluster and 1 terabyte of storage, with an in-core read throughput of 61,432 operations/s and a write throughput of 13,582 operations/s.

Abstract

This paper presents a new persistent data management layer designed to simplify cluster-based Internet service construction. This self-managing layer, called a distributed data structure (DDS), presents a conventional single-site data structure interface to service authors, but partitions and replicates the data across a cluster. We have designed and implemented a distributed hash table DDS that has properties necessary for Internet services (incremental scaling of throughput and data capacity, fault tolerance and high availability, high concurrency, consistency, and durability). The hash table uses two-phase commits to present a coherent view of its data across all cluster nodes, allowing any node to service any task. We show that the distributed hash table simplifies Internet service construction by decoupling service-specific logic from the complexities of persistent, consistent state management, and by allowing services to inherit the necessary service properties from the DDS rather than having to implement the properties themselves. We have scaled the hash table to a 128 node cluster, 1 terabyte of storage, and an in-core read throughput of 61,432 operations/s and write throughput of 13,582 operations/s.
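
The idea of a single-site hash table interface backed by partitioned, replicated storage and two-phase commit can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the names `Brick` and `ToyHashTable`, the SHA-256 based partitioning, and the abort-on-any-failure policy are all assumptions made for illustration, and the abstract does not say how the real system handles a failed replica.

```python
import hashlib


class Brick:
    """One storage node (hypothetical name) with committed and staged data."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.committed = {}
        self.staged = {}

    def prepare(self, key, value):
        # Phase 1: stage the write and vote; a dead brick votes "no".
        if not self.alive:
            return False
        self.staged[key] = value
        return True

    def commit(self, key):
        # Phase 2 (all voted yes): make the staged value visible.
        self.committed[key] = self.staged.pop(key)

    def abort(self, key):
        # Phase 2 (someone voted no): discard the staged value.
        self.staged.pop(key, None)


class ToyHashTable:
    """Single-site put/get interface over partitioned, replicated bricks."""

    def __init__(self, num_partitions, replicas_per_partition):
        self.partitions = [
            [Brick(f"p{p}r{r}") for r in range(replicas_per_partition)]
            for p in range(num_partitions)
        ]

    def _replicas(self, key):
        # Assumed partitioning scheme: hash the key, pick a replica group.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.partitions)
        return self.partitions[index]

    def put(self, key, value):
        replicas = self._replicas(key)
        prepared = []
        for brick in replicas:
            if brick.prepare(key, value):
                prepared.append(brick)
            else:
                # Simplification: abort the whole write if any replica fails.
                for done in prepared:
                    done.abort(key)
                return False
        for brick in replicas:
            brick.commit(key)
        return True

    def get(self, key):
        # Replicas agree after a commit, so any live one can answer.
        for brick in self._replicas(key):
            if brick.alive:
                return brick.committed.get(key)
        raise RuntimeError("no live replica for key")


table = ToyHashTable(num_partitions=4, replicas_per_partition=2)
ok = table.put("user:42", "alice")
value = table.get("user:42")
```

The caller sees only `put` and `get`, which is the sense in which service logic is decoupled from state management; partition lookup, replication, and the commit protocol stay inside the table. Durability, networking, and recovery are left out of this sketch entirely.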
