Publication | Closed Access
Balancing efficiency and fairness in heterogeneous GPU clusters for deep learning
Citations: 133
References: 27
Year: 2020
Venue: Unknown
Cluster Computing, Heterogeneous GPU Clusters, Heterogeneous Computing, Engineering, Machine Learning, GPU Benchmarking, Computer Architecture, GPU Computing, Data Science, Parallel Computing, Computer Science, Deep Learning Training, Deep Learning, GPU Cluster, Present Gandivafair, GPU Architecture, Fair Share Scheduler, Cloud Computing, Parallel Programming, GPU Virtualization
We present Gandivafair, a distributed, fair share scheduler that balances the conflicting goals of efficiency and fairness in GPU clusters for deep learning training (DLT). Gandivafair provides performance isolation between users, enabling multiple users to share a single cluster and thus maximizing cluster efficiency. Gandivafair is the first scheduler that allocates cluster-wide GPU time fairly among active users.
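To make "allocating GPU time fairly among active users" concrete, here is a minimal sketch of generic max-min fairness (water-filling). It is not Gandivafair's algorithm, which the abstract does not describe. The function name `max_min_fair_share`, the user names, and the demand figures are all hypothetical and chosen only for illustration.

```python
def max_min_fair_share(total_gpus, demands):
    """Split total_gpus among users by max-min fairness (water-filling).

    Illustrative only: a textbook fair-share rule, not the scheduler
    described in the abstract. `demands` maps a (hypothetical) user name
    to the number of GPUs that user could keep busy.
    """
    allocation = {user: 0.0 for user in demands}
    remaining = float(total_gpus)
    unsatisfied = {user for user, demand in demands.items() if demand > 0}

    while unsatisfied and remaining > 1e-9:
        # Equal split of what is left among users who still want more.
        share = remaining / len(unsatisfied)
        # Users whose outstanding demand fits inside that equal split.
        capped = {u for u in unsatisfied if demands[u] - allocation[u] <= share}
        if not capped:
            # Everyone wants more than the equal split: hand it out and stop.
            for u in unsatisfied:
                allocation[u] += share
            remaining = 0.0
            break
        # Satisfy the small demands fully; their unused share is
        # redistributed in the next round.
        for u in capped:
            need = demands[u] - allocation[u]
            allocation[u] += need
            remaining -= need
        unsatisfied -= capped

    return allocation


# Hypothetical example: 8 GPUs, three active users.
example = max_min_fair_share(8, {"alice": 2, "bob": 4, "carol": 10})
print(example)
```

In this example the user with the smallest demand is fully satisfied, and the capacity that user leaves unused is split evenly between the other two. This shows how a fair-share rule can keep the cluster fully used while preventing any one user from crowding out the others.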