Concepedia

TLDR

User-level threading and tasking models were developed to reduce the overhead of OS-level threads, yet existing solutions are either too specific to particular applications or architectures or lack sufficient power and flexibility. This paper introduces Argobots, a lightweight, low-level threading and tasking framework intended as a portable, high-performance substrate for high-level programming models and runtime systems. Argobots implements an execution model that balances general functionality with rich controls for specialization, and the paper presents integrations with OpenMP, MPI, and colocated I/O services. Evaluations show that Argobots is competitive with simpler generic threading runtimes despite its richer capabilities, that its OpenMP runtime interoperates more efficiently than production OpenMP runtimes, that MPI over Argobots has lower synchronization costs and better latency hiding than MPI over Pthreads, and that Argobots-based I/O services reduce interference with colocated applications while performing competitively with a Pthreads approach.

Abstract

In the past few decades, a number of user-level threading and tasking models have been proposed in the literature to address the shortcomings of OS-level threads, primarily with respect to cost and flexibility. Current state-of-the-art user-level threading and tasking models, however, are either too specific to particular applications or architectures or insufficiently powerful and flexible. In this paper, we present Argobots, a lightweight, low-level threading and tasking framework that is designed as a portable and performant substrate for high-level programming models or runtime systems. Argobots offers a carefully designed execution model that balances generality of functionality with a rich set of controls that allow specialization by end users or high-level programming models. We describe the design, implementation, and performance characterization of Argobots and present integrations with three high-level models: OpenMP, MPI, and colocated I/O services. Evaluations show that (1) Argobots, while providing richer capabilities, is competitive with existing simpler generic threading runtimes; (2) our OpenMP runtime offers more efficient interoperability capabilities than production OpenMP runtimes do; (3) when MPI interoperates with Argobots instead of Pthreads, it enjoys reduced synchronization costs and better latency-hiding capabilities; and (4) I/O services with Argobots reduce interference with colocated applications while achieving performance competitive with that of a Pthreads approach.
