
Abstract

Effective parallelization strategies are crucial for the performance of distributed deep neural network (DNN) training. Recently, several methods have been proposed to search for parallelization strategies, but they all optimize a single objective (e.g., execution time or memory consumption) and produce only one strategy. We propose Frontier Tracking (FT), an efficient algorithm that finds a set of Pareto-optimal parallelization strategies to explore the best trade-off among different objectives. FT can minimize memory consumption when the number of devices is limited and fully utilize additional resources to reduce execution time. Based on FT, we develop a user-friendly system, called TensorOpt, which allows users to run their distributed DNN training jobs without handling the details of searching for and coding parallelization strategies. Experimental results show that TensorOpt adapts more flexibly to resource availability than existing frameworks.
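
To make the notion of a Pareto-optimal set concrete, the following is a minimal sketch of filtering candidate strategies by two objectives (execution time and memory consumption) and then picking the fastest strategy that fits a memory budget. It is not the FT algorithm and not TensorOpt's API; the names `Strategy`, `pareto_front`, `fastest_within`, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Optional


class Strategy(NamedTuple):
    # All fields are hypothetical illustration values, not measured results.
    name: str         # label for a candidate parallelization strategy
    time_s: float     # estimated execution time per iteration (seconds)
    memory_gb: float  # estimated peak memory per device (GB)


def pareto_front(candidates: List[Strategy]) -> List[Strategy]:
    """Return the strategies not dominated in (time, memory).

    A strategy is dominated if another one is no worse in both
    objectives and strictly better in at least one.
    """
    front: List[Strategy] = []
    best_memory = float("inf")
    # Sweep in order of increasing time; keep a strategy only if it
    # uses strictly less memory than every faster (or equal) one.
    for s in sorted(candidates, key=lambda s: (s.time_s, s.memory_gb)):
        if s.memory_gb < best_memory:
            front.append(s)
            best_memory = s.memory_gb
    return front


def fastest_within(front: List[Strategy], memory_budget_gb: float) -> Optional[Strategy]:
    """Pick the fastest strategy on the front that fits the memory budget."""
    feasible = [s for s in front if s.memory_gb <= memory_budget_gb]
    return min(feasible, key=lambda s: s.time_s) if feasible else None


candidates = [
    Strategy("data_parallel", 1.0, 16.0),
    Strategy("hybrid_a", 1.4, 10.0),
    Strategy("hybrid_b", 1.6, 12.0),   # dominated by hybrid_a
    Strategy("model_parallel", 2.5, 6.0),
    Strategy("hybrid_c", 3.0, 6.0),    # dominated by model_parallel
]

front = pareto_front(candidates)
choice = fastest_within(front, memory_budget_gb=12.0)
print([s.name for s in front], choice.name if choice else None)
```

Keeping the whole front, rather than a single optimum, is what lets the choice change with resource availability: a tighter memory budget selects a slower, leaner strategy, while a larger budget selects a faster one.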
