Chronus

Citations: 42

References: 25

Year: 2021

Abstract

Modern GPU clusters run Deep Learning training (DLT) jobs in a distributed manner. Job scheduling is key to improving training performance, resource utilization, and fairness across users. Different training jobs may have different objectives and demands regarding completion time. How to satisfy all of these requirements efficiently has not been extensively studied.
