FedAT
Publication | Open Access | 2021 | Unknown Venue
120 Citations · 17 References
Cluster Computing, Decentralized Machine Learning, Engineering, Tradeoff Space, Collaborative Learning, Communication Bottleneck, Federated Learning, Cloud Computing, Computer Architecture, Federated Structure, Distributed Systems, Computer Science, Distributed Learning, Parallel Programming, Parallel Computing, Distributed Model, Distributed Data Analytics
Federated learning (FL) involves training a model across massively distributed devices while keeping the training data local and private. This form of collaborative learning exposes new tradeoffs among model convergence speed, model accuracy, balance across clients, and communication cost, along with new challenges: (1) the straggler problem, where clients lag behind due to heterogeneity in data or in computing and network resources; and (2) the communication bottleneck, where a large number of clients communicating their local updates to a central server overwhelm the server. Many existing FL methods optimize along only a single dimension of this tradeoff space. Existing solutions tackle the straggler problem with asynchronous model updating or with tiering-based synchronous mechanisms. However, asynchronous methods can easily create a communication bottleneck, while tiering may introduce biases that favor faster tiers with shorter response latencies.
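The tiering mechanism the abstract refers to can be illustrated with a small sketch: clients are ranked by measured response latency and sliced into tiers of similar speed, so each tier can synchronize internally without waiting on stragglers from slower tiers. The function name, tier count, and grouping rule below are illustrative assumptions for exposition, not FedAT's actual implementation.

```python
# Hypothetical sketch of latency-based client tiering, as used by
# tiering-based synchronous FL schemes. Names and the default tier
# count are assumptions, not FedAT's API.

def assign_tiers(latencies, num_tiers=3):
    """Group client ids into tiers of roughly equal size, fastest first.

    latencies: dict mapping client_id -> measured response latency (seconds).
    Returns a list of num_tiers lists of client ids.
    """
    ranked = sorted(latencies, key=latencies.get)  # fastest clients first
    tiers = [[] for _ in range(num_tiers)]
    for i, client in enumerate(ranked):
        # Evenly slice the ranked list so each tier holds similar-speed clients.
        tiers[min(i * num_tiers // len(ranked), num_tiers - 1)].append(client)
    return tiers

latencies = {"c1": 0.2, "c2": 1.5, "c3": 0.4, "c4": 3.0, "c5": 0.9, "c6": 2.2}
print(assign_tiers(latencies))  # → [['c1', 'c3'], ['c5', 'c2'], ['c6', 'c4']]
```

Because faster tiers complete more synchronous rounds per unit time, a server aggregating naively over tiers would see more updates from them, which is the source of the bias toward shorter-latency tiers noted above.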