On Model Parallelization and Scheduling Strategies for Distributed Machine Learning

TLDR

Distributed machine learning is usually data-parallel, partitioning data across workers, whereas model parallelism, which partitions model parameters across workers, poses distinct system, algorithmic, and theoretical challenges. The paper introduces STRADS, a system that schedules parameter updates in model-parallel machine learning by exploiting the evolving structural properties of models. Its dynamic scheduling abstraction selects which parameters to update, enabling efficient model-parallel algorithms for topic modeling, matrix factorization, and Lasso on distributed workers. Experiments show that STRADS achieves better memory efficiency and comparable or superior performance relative to existing implementations of these three applications.

Abstract

Distributed machine learning has typically been approached from a data-parallel perspective, where big data are partitioned across multiple workers and an algorithm is executed concurrently over different data subsets under various synchronization schemes to ensure speed-up and/or correctness. A sibling problem that has received relatively less attention is how to ensure efficient and correct model-parallel execution of ML algorithms, where the parameters of an ML program are partitioned across different workers and undergo concurrent iterative updates. We argue that model and data parallelism impose rather different challenges for system design, algorithmic adjustment, and theoretical analysis. In this paper, we develop a system for model parallelism, STRADS, that provides a programming abstraction for scheduling parameter updates by discovering and leveraging changing structural properties of ML programs. STRADS enables a flexible tradeoff between scheduling efficiency and fidelity to intrinsic dependencies within the models, and improves memory efficiency of distributed ML. We demonstrate the efficacy of model-parallel algorithms implemented on STRADS versus popular implementations for topic modeling, matrix factorization, and Lasso.
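
To make the idea of scheduling parameter updates concrete, the following is a minimal single-process sketch of model-parallel Lasso by coordinate descent. It is not the STRADS API. The function names (`schedule`, `model_parallel_lasso`), the priority rule (squared size of a parameter's last change plus a small constant `eta`), and the correlation threshold `rho` are all assumptions chosen for illustration. Columns of the design matrix are assumed to have unit norm, and the "workers" are simulated by computing every proposal from the same residual snapshot.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def soft_threshold(z, lam):
    # Proximal operator of the L1 penalty.
    return (z - lam) if z > lam else (z + lam) if z < -lam else 0.0

def schedule(cols, priority, batch, rho, rng):
    """Pick up to `batch` coordinates to update concurrently.

    Candidates are drawn with probability proportional to `priority`
    (hypothetical rule), and a candidate is rejected if its column is
    correlated above `rho` with any coordinate already chosen, so that
    concurrent updates interfere only weakly.
    """
    candidates = list(range(len(cols)))
    chosen = []
    while candidates and len(chosen) < batch:
        weights = [priority[j] for j in candidates]
        j = rng.choices(candidates, weights=weights, k=1)[0]
        candidates.remove(j)
        if all(abs(dot(cols[j], cols[k])) <= rho for k in chosen):
            chosen.append(j)
    return chosen

def model_parallel_lasso(cols, y, lam, rounds=50, batch=2,
                         rho=0.1, eta=1e-6, seed=0):
    """cols[j] is the j-th unit-norm column of the design matrix."""
    rng = random.Random(seed)
    beta = [0.0] * len(cols)
    priority = [1.0] * len(cols)   # start with uniform priorities
    residual = list(y)             # y - X @ beta, with beta = 0
    for _ in range(rounds):
        chosen = schedule(cols, priority, batch, rho, rng)
        # Each simulated worker computes its update from the same snapshot.
        proposals = {
            j: soft_threshold(dot(cols[j], residual) + beta[j], lam)
            for j in chosen
        }
        # Aggregate: apply the updates and refresh residual and priorities.
        for j, new in proposals.items():
            delta = new - beta[j]
            residual = [r - delta * x for r, x in zip(residual, cols[j])]
            beta[j] = new
            priority[j] = delta * delta + eta
    return beta
```

In this sketch, raising `rho` or `batch` schedules more parameters per round at the cost of ignoring stronger dependencies between them, which is one simple way to picture the tradeoff between scheduling efficiency and fidelity to model dependencies described in the abstract.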
