Publication | Closed Access
Descriptors representing two- and three-body atomic distributions and their effects on the accuracy of machine-learned inter-atomic potentials
Citations: 222 | References: 46 | Year: 2020
Engineering; Machine Learning; Many-body Quantum Physics; Computational Chemistry; Energy Minimization; Data Science; Physics-Aware Machine Learning; Multi-task Learning; Three-body Atomic Distributions; Distribution Function; Principal Component Analysis; Biophysics; Machine-learning Models; Quantum Science; Physics; Atomic Physics; Quantum Chemistry; Deep Learning; Ab-initio Method; Natural Sciences; Molecular Property; Applied Physics; Machine-learned Inter-atomic Potentials; Many-body Problem
When determining machine-learning models for inter-atomic potentials, the potential energy surface is often described as a non-linear function of descriptors representing two- and three-body atomic distribution functions. It is not obvious how the choice of descriptors affects the efficiency of the training and the accuracy of the final machine-learned model. In this work, we formulate an efficient method to calculate descriptors that can separately represent two- and three-body atomic distribution functions, and we examine the effects of including only two-body descriptors, only three-body descriptors, or both in the regression model. Our study indicates that non-linear mixing of two- and three-body descriptors is essential for efficient training and high accuracy of the final machine-learned model. The efficiency can be further improved by weighting the two-body descriptors more strongly. We furthermore examine a sparsification of the three-body descriptors. The three-body descriptors usually provide redundant representations of the atomistic structure, and the number of descriptors can be significantly reduced without loss of accuracy by applying an automatic sparsification based on principal component analysis. Visualization of the reduced descriptors using three-body distribution functions in real space indicates that the sparsification automatically removes the components that are less significant for describing the distribution function.
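The abstract does not spell out how the sparsification or the weighting is implemented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a synthetic, deliberately redundant "three-body" descriptor matrix, a plain PCA computed via SVD, and a retained-variance threshold for choosing the number of components. The function names (`pca_reduce`, `combine_descriptors`, `make_redundant_descriptors`), the threshold `var_keep`, and the weight `w2` are all hypothetical and chosen for illustration.

```python
import numpy as np


def pca_reduce(X, var_keep=0.999):
    """Project descriptors onto the leading principal components.

    Assumption: components are kept until they explain `var_keep`
    of the total variance; the source does not state its criterion.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    # Smallest k whose cumulative explained variance reaches var_keep.
    k = min(int(np.searchsorted(cumulative, var_keep)) + 1, len(s))
    components = Vt[:k]
    return Xc @ components.T, components, mean


def combine_descriptors(d2, d3_reduced, w2=2.0):
    """Concatenate two- and (reduced) three-body descriptors.

    Assumption: 'weighting the two-body descriptors more strongly'
    is modeled here as a simple scalar factor w2 > 1.
    """
    return np.hstack([w2 * d2, d3_reduced])


def make_redundant_descriptors(n_samples=200, n_latent=5, n_desc=60, seed=0):
    """Synthetic stand-in for three-body descriptors (hypothetical data).

    60 descriptor columns are linear mixtures of only 5 latent
    variables plus tiny noise, so most columns are redundant.
    """
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, n_latent))
    mixing = rng.normal(size=(n_latent, n_desc))
    noise = 1e-6 * rng.normal(size=(n_samples, n_desc))
    return latent @ mixing + noise


if __name__ == "__main__":
    d3 = make_redundant_descriptors()
    d3_reduced, components, _ = pca_reduce(d3)
    d2 = np.random.default_rng(1).normal(size=(d3.shape[0], 8))
    features = combine_descriptors(d2, d3_reduced)
    print("three-body descriptors:", d3.shape[1], "->", d3_reduced.shape[1])
    print("combined feature matrix:", features.shape)
```

The combined feature matrix would then be passed to a non-linear regression model; which model and kernel to use is outside the scope of this sketch.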