Concepedia

Publication | Open Access

Bridging Machine Learning and Thermodynamics for Accurate p<i>K</i><sub>a</sub> Prediction

Citations: 22

References: 53

Year: 2024

Abstract

Integrating scientific principles into machine learning models to enhance their predictive performance and generalizability is a central challenge in the development of AI for Science. Herein, we introduce Uni-p<i>K</i><sub>a</sub>, a novel framework that successfully incorporates thermodynamic principles into machine learning modeling, achieving high-precision predictions of acid dissociation constants (p<i>K</i><sub>a</sub>), a crucial task in the rational design of drugs and catalysts, as well as a modeling challenge in computational physical chemistry for small organic molecules. Uni-p<i>K</i><sub>a</sub> utilizes a comprehensive free energy model to represent molecular protonation equilibria accurately. It features a structure enumerator that reconstructs molecular configurations from p<i>K</i><sub>a</sub> data, coupled with a neural network that functions as a free energy predictor, ensuring high-throughput, data-driven prediction while preserving thermodynamic consistency. Employing a pretraining-finetuning strategy with both predicted and experimental p<i>K</i><sub>a</sub> data, Uni-p<i>K</i><sub>a</sub> not only achieves state-of-the-art accuracy in chemoinformatics but also shows comparable precision to quantum mechanics-based methods.
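As an illustration of how a free energy model can tie microstate predictions to an observable p<i>K</i><sub>a</sub>, the sketch below applies the standard thermodynamic relation between ensemble partition sums and a macroscopic dissociation constant. It is not taken from the paper: the function names `logsumexp` and `macro_pka`, the use of dimensionless free energies (G/RT), and the pH 0 reference for the proton are all assumptions made for this example, and the free energies are supplied by hand rather than by a neural network.

```python
import math

LN10 = math.log(10.0)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def macro_pka(acid_free_energies, base_free_energies):
    """Macroscopic pKa from microstate free energies (hypothetical helper).

    Assumptions for this sketch:
      - free energies are dimensionless (G / RT),
      - they are referenced to pH 0, so the proton contributes no extra term,
      - each list enumerates all microstates of one protonation level.

    With Z = sum(exp(-G)) over microstates, Ka = Z_base / Z_acid, hence
    pKa = (ln Z_acid - ln Z_base) / ln 10.
    """
    log_z_acid = logsumexp([-g for g in acid_free_energies])
    log_z_base = logsumexp([-g for g in base_free_energies])
    return (log_z_acid - log_z_base) / LN10


# Two equivalent protonated microstates that share one deprotonated form,
# each with a microscopic pKa of 5: the macroscopic value is 5 + log10(2).
pka = macro_pka([0.0, 0.0], [5.0 * LN10])
print(round(pka, 2))  # about 5.30
```

Because every microstate enters through a single free energy, any p<i>K</i><sub>a</sub> values derived this way are mutually consistent around a thermodynamic cycle, which is the kind of consistency the abstract refers to.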
