Publication | Closed Access

Matchmaking: distributed resource management for high throughput computing

Citations: 755 | References: 9 | Year: 2002

TLDR

Conventional resource management relies on a system model and centralized scheduler, but this paradigm struggles in distributed high‑throughput computing environments because of resource heterogeneity and distributed ownership, as exemplified by the widely used Condor system. The authors developed the classified advertisement (classad) matchmaking framework to provide a flexible, general resource‑management solution for distributed environments with decentralized ownership. The framework uses a semi‑structured data model that integrates schema, data, and query in a simple specification language, separates matching from claiming, and was engineered to address real deployment challenges in Condor. The resulting framework is robust, scalable, and flexible, and its matchmaking architecture underpins Condor’s robustness and efficiency.

Abstract

Conventional resource management systems use a system model to describe resources and a centralized scheduler to control their allocation. We argue that this paradigm does not adapt well to distributed systems, particularly those built to support high throughput computing. Obstacles include heterogeneity of resources, which makes uniform allocation algorithms difficult to formulate, and distributed ownership, which leads to widely varying allocation policies. Faced with these problems, we developed and implemented the classified advertisement (classad) matchmaking framework, a flexible and general approach to resource management in distributed environments with decentralized ownership of resources. Novel aspects of the framework include a semi-structured data model that combines schema, data, and query in a simple but powerful specification language, and a clean separation of the matching and claiming phases of resource allocation. The representation and protocols result in a robust, scalable and flexible framework that can evolve with changing resources. The framework was designed to solve real problems encountered in the deployment of Condor, a high throughput computing system developed at the University of Wisconsin-Madison. Condor is heavily used by scientists at numerous sites around the world. It derives much of its robustness and efficiency from the matchmaking architecture.
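
The two ideas the abstract highlights, advertisements that carry both data and query, and a matching phase kept separate from claiming, can be illustrated with a small Python sketch. This is not Condor's classad language or API. The dictionary layout, the attribute names (`memory_mb`, `image_mb`, `arch`, `owner`, `claimed`), and the functions `satisfies`, `match`, and `claim` are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

# Hypothetical stand-in for a classad: a schema-free dict whose
# "requirements" and "rank" entries are callables taking (my, other).
Ad = Dict[str, Any]


def satisfies(ad: Ad, other: Ad) -> bool:
    """Evaluate ad's own requirements against the other ad.

    An ad without a "requirements" entry imposes no constraint.
    """
    req = ad.get("requirements")
    return True if req is None else bool(req(ad, other))


def match(request: Ad, offers: List[Ad]) -> Optional[Ad]:
    """Matching phase: pick the mutually compatible offer the request ranks highest."""
    compatible = [o for o in offers if satisfies(request, o) and satisfies(o, request)]
    if not compatible:
        return None
    rank = request.get("rank", lambda my, other: 0)
    return max(compatible, key=lambda o: rank(request, o))


def claim(request: Ad, offer: Ad) -> bool:
    """Claiming phase: the resource re-checks its current state before accepting.

    A match is only an introduction; the offer may have changed since it
    was advertised, so the claim can still be refused.
    """
    if offer.get("claimed") or not satisfies(offer, request):
        return False
    offer["claimed"] = True
    return True


# Illustrative ads; every attribute and value here is made up.
machines = [
    {"name": "m1", "memory_mb": 2048, "arch": "x86_64",
     "requirements": lambda my, other: other.get("owner") != "untrusted"},
    {"name": "m2", "memory_mb": 8192, "arch": "x86_64",
     "requirements": lambda my, other: other.get("image_mb", 0) <= my["memory_mb"]},
]
job = {
    "owner": "alice",
    "image_mb": 1024,
    "requirements": lambda my, other: (
        other.get("arch") == "x86_64" and other.get("memory_mb", 0) >= my["image_mb"]
    ),
    "rank": lambda my, other: other.get("memory_mb", 0),
}

chosen = match(job, machines)
first_claim = claim(job, chosen)    # accepted: the machine is still free
second_claim = claim(job, chosen)   # refused: the machine is already claimed
```

In this sketch each owner expresses policy inside its own advertisement, so the matcher needs no global system model, and a stale match is caught at claim time rather than trusted blindly.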
