Managing Scalability in Object Storage Systems for HPC Linux Clusters.
Year: 2004 | Citations: 27 | References: 8
Keywords: Cluster Computing, Storage Performance, Engineering, Storage Management, Computer Architecture, Parallel Storage, Storage Structure, Object Storage Devices, Storage Systems, Computing Systems, Parallel Computing, Data Management, File Systems, Computer Engineering, Computer Science, Distributed Data Storage, Object Storage Systems, Scalable Storage Systems, Operating Systems, Storage Assignment, Parallel Programming
This paper describes the performance and manageability of scalable storage systems based on Object Storage Devices (OSDs). Object-based storage was invented to provide scalable performance as the storage cluster grows. For example, in our large-file tests a 10-OSD system provided 325 MB/sec of read bandwidth (from disk) to 5 clients, and a 299-OSD system provided 10,334 MB/sec of read bandwidth to 151 clients: linear scaling of roughly 30x speedup with 30x more client demand and 30x more storage resources. However, the system must not become more difficult to manage as it grows; otherwise, the performance benefits can quickly be overshadowed by the administrative burden of managing the system. Instead, the storage cluster must present a single system image from the management perspective, even though it may internally comprise tens, hundreds, or thousands of object storage devices. For the HPC market, which is characterized by unusually large clusters and unusually small IT budgets, it is important that the storage system “just work” with relatively little administrative overhead.

1. Scale Out, not Scale Up

The high-performance computing (HPC) sector has often driven the development of new computing architectures, and it has given impetus to the development of the Object Storage Architecture. The architecture driving change today is the Linux cluster, which is revolutionizing scientific, technical, and business computing. The invention of Beowulf clustering and the development of the Message Passing Interface (MPI) middleware allowed racks of commodity Intel PC-based systems running the Linux operating system to emulate most of the functionality of monolithic Symmetric MultiProcessing (SMP) systems. Because this can be done at less than 10% of the cost of highly specialized shared-memory systems, the cost of scientific research dropped dramatically.
Linux clusters are now the dominant computing architecture for scientific computing, and are quickly gaining traction in technical computing environments as well.
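The linear-scaling claim in the abstract can be sanity-checked with a quick calculation using the figures reported there (this is just illustrative arithmetic, not part of the paper's methodology):

```python
# Figures quoted in the abstract: small and large OSD configurations.
small = {"osds": 10, "clients": 5, "read_mb_s": 325}
large = {"osds": 299, "clients": 151, "read_mb_s": 10334}

# Ratios between the two configurations.
storage_ratio = large["osds"] / small["osds"]              # ~29.9x more OSDs
client_ratio = large["clients"] / small["clients"]         # ~30.2x more clients
bandwidth_ratio = large["read_mb_s"] / small["read_mb_s"]  # ~31.8x more bandwidth

print(f"storage x{storage_ratio:.1f}, clients x{client_ratio:.1f}, "
      f"bandwidth x{bandwidth_ratio:.1f}")
```

Delivered bandwidth grows in step with both client demand and storage resources (all three ratios are roughly 30x), which is what "linear scaling" means here: each added OSD contributes its share of aggregate bandwidth rather than saturating a shared bottleneck.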