Publication | Open Access
Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control
Citations: 255
References: 84
Year: 2022
Engineering, Machine Learning, Data Science, Data Mining, Pattern Recognition, Sparse Modeling, SINDy Models, Statistics, Nonlinear Time Series, High-noise Limit, Predictive Analytics, Knowledge Discovery, Sparse Model Identification, Inverse Problems, Computer Science, Forecasting, Statistical Learning Theory, Active Learning, Sparse Identification, Sparse Representation, Bootstrap Resampling, High-dimensional Method, Compressive Sensing, Statistical Inference, Ensemble Algorithm
Sparse model identification enables discovery of nonlinear dynamical systems from data, but it is highly sensitive to noise, especially when data are scarce. This work uses bootstrap aggregating to make the sparse identification of nonlinear dynamics (SINDy) algorithm more robust. An ensemble of SINDy models is built from random subsets of limited, noisy data; aggregate statistics yield inclusion probabilities for candidate functions, enabling uncertainty quantification, probabilistic forecasts, active learning, and improved model predictive control. The resulting ensemble-SINDy (E-SINDy) algorithm substantially improves accuracy and robustness: it recovers partial differential equations from data with more than twice the noise of previous reports, learns Lotka–Volterra dynamics from sparse historical data, and scales computationally much like standard SINDy.
Sparse model identification enables the discovery of nonlinear dynamical systems purely from data; however, this approach is sensitive to noise, especially in the low-data limit. In this work, we leverage the statistical approach of bootstrap aggregating (bagging) to robustify the sparse identification of nonlinear dynamics (SINDy) algorithm. First, an ensemble of SINDy models is identified from subsets of limited and noisy data. The aggregate model statistics are then used to produce inclusion probabilities of the candidate functions, which enables uncertainty quantification and probabilistic forecasts. We apply this ensemble-SINDy (E-SINDy) algorithm to several synthetic and real-world datasets and demonstrate substantial improvements to the accuracy and robustness of model discovery from extremely noisy and limited data. For example, E-SINDy uncovers partial differential equation models from data with more than twice as much measurement noise as has been previously reported. Similarly, E-SINDy learns the Lotka–Volterra dynamics from remarkably limited data of yearly lynx and hare pelts collected from 1900 to 1920. E-SINDy is computationally efficient, with scaling similar to that of standard SINDy. Finally, we show that ensemble statistics from E-SINDy can be exploited for active learning and improved model predictive control.
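The bagging idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes sequentially thresholded least squares as the sparse regressor (a common choice with SINDy), draws bootstrap samples of the data rows with replacement, and aggregates the ensemble with a coefficient-wise median. The function names `stlsq` and `ensemble_sindy`, the threshold value, and the toy linear system are all hypothetical choices made for illustration.

```python
import numpy as np


def stlsq(theta, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for j in range(dxdt.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                # Refit state j using only the candidate terms still active.
                xi[keep, j] = np.linalg.lstsq(
                    theta[:, keep], dxdt[:, j], rcond=None
                )[0]
    return xi


def ensemble_sindy(theta, dxdt, n_models=100, threshold=0.05, seed=0):
    """Fit one sparse model per bootstrap sample of the data rows."""
    rng = np.random.default_rng(seed)
    n_samples = theta.shape[0]
    coefs = np.empty((n_models, theta.shape[1], dxdt.shape[1]))
    for k in range(n_models):
        idx = rng.integers(0, n_samples, size=n_samples)  # with replacement
        coefs[k] = stlsq(theta[idx], dxdt[idx], threshold)
    # Inclusion probability: fraction of models keeping each candidate term.
    inclusion_prob = (coefs != 0).mean(axis=0)
    return coefs, inclusion_prob


# Toy data (assumed for illustration): a damped linear oscillator
# dx/dt = -0.1 x + 2 y, dy/dt = -2 x - 0.1 y, with noisy derivatives.
rng = np.random.default_rng(1)
A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
X = rng.uniform(-1.0, 1.0, size=(200, 2))
dX = X @ A.T + 0.01 * rng.standard_normal((200, 2))

# Candidate library with columns [1, x, y].
theta = np.column_stack([np.ones(len(X)), X])

coefs, prob = ensemble_sindy(theta, dX, n_models=50, threshold=0.05)
xi_median = np.median(coefs, axis=0)  # aggregated model (one possible choice)

print(prob)       # inclusion probability per library term and state
print(xi_median)  # aggregated coefficients, rows ordered as [1, x, y]
```

In this sketch, terms with an inclusion probability near one are those the ensemble consistently selects, while the spread of `coefs` across models gives a rough measure of coefficient uncertainty.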