Publication | Open Access
Stochastic Weight Averaging Revisited
Citations: 19
References: 19
Year: 2023
Artificial Intelligence, Parameter Space, Engineering, Machine Learning, Statistical Averaging, Mathematical Statistic, Data Science, Sparse Neural Network, Backbone SGD, Stochastic Weight Averaging, Statistics, Neural Scaling Law, Machine Learning Model, Neural Network Weights, Computer Engineering, Probability Theory, Computer Science, Deep Learning, Neural Architecture Search, Stochastic Optimization
Averaging neural network weights sampled by a backbone stochastic gradient descent (SGD) is a simple yet effective way to help the backbone SGD find optima that generalize better. From a statistical perspective, weight averaging contributes to variance reduction. Recently, a well-established stochastic weight-averaging (SWA) method was proposed, which applies a cyclical or high-constant (CHC) learning-rate schedule to generate the weight samples that are averaged. A new insight on weight averaging was then introduced, stating that weight averaging helps discover wider optima and thereby yields better generalization. We conducted extensive experimental studies of SWA, involving 12 modern deep neural network architectures and 12 open-source image, graph, and text datasets as benchmarks. We disentangled the contributions of the weight-averaging operation and the CHC learning-rate schedule in SWA, showing that the weight-averaging operation still contributed to variance reduction, while the CHC learning-rate schedule helped explore the parameter space more widely than the backbone SGD, which could be under-fitted due to a lack of training budget. We then presented an algorithm termed periodic SWA (PSWA), which comprises a series of weight-averaging operations that exploit the wide parameter-space structures explored by the CHC learning-rate schedule, and we empirically demonstrated that PSWA outperformed its backbone SGD remarkably.
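The idea of averaging weight samples, and of chaining several averaging operations, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy objective (a noisy quadratic standing in for minibatch training), the constant learning rate standing in for the CHC schedule, the assumption that each period restarts from the previous period's average, and all function names (`running_average`, `sgd_samples`, `periodic_swa`) are illustrative choices.

```python
import random


def running_average(samples):
    """Return the element-wise mean of a sequence of weight vectors.

    The mean is updated incrementally, so only one vector is kept in memory.
    """
    avg = None
    for n, w in enumerate(samples, start=1):
        if avg is None:
            avg = list(w)
        else:
            avg = [a + (x - a) / n for a, x in zip(avg, w)]
    return avg


def noisy_grad(w, rng, noise=1.0):
    """Gradient of 0.5 * ||w||^2 plus Gaussian noise (a stand-in for a minibatch gradient)."""
    return [x + rng.gauss(0.0, noise) for x in w]


def sgd_samples(w0, lr, steps, rng):
    """Run plain SGD from w0 and return the weights visited after each step."""
    w = list(w0)
    samples = []
    for _ in range(steps):
        g = noisy_grad(w, rng)
        w = [x - lr * gi for x, gi in zip(w, g)]
        samples.append(list(w))
    return samples


def periodic_swa(w0, lr, steps_per_period, periods, seed=0):
    """Hypothetical periodic averaging loop.

    Assumption: each period samples weights with SGD, averages them, and the
    next period starts from that average. Returns the average of every period.
    """
    rng = random.Random(seed)
    w = list(w0)
    history = []
    for _ in range(periods):
        samples = sgd_samples(w, lr, steps_per_period, rng)
        w = running_average(samples)
        history.append(w)
    return history
```

Calling `periodic_swa([5.0, -3.0], lr=0.1, steps_per_period=200, periods=3)` returns one averaged weight vector per period. In this toy setting the averaged vector fluctuates less than the individual SGD iterates it summarizes, which is the variance-reduction effect the abstract refers to.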