Concepedia

Publication | Open Access

A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility

Citations: 173

References: 28

Year: 2020

TLDR

Accurate prediction of molecular properties such as lipophilicity and solubility is essential for rational compound design in the chemical and pharmaceutical industries. The authors develop a self-attention-based message-passing neural network (SAMPN) to model the relationship between chemical structure and properties in an interpretable way. SAMPN is a graph neural network whose self-attention mechanism highlights each atom's contribution to the predicted property, and its Multi-SAMPN variant jointly predicts multiple properties with higher accuracy and efficiency than single-property models. SAMPN outperforms random forests and DeepChem's MPN, offers interpretable atom-level insights, and its code is freely available on GitHub.

Abstract

Efficient and accurate prediction of molecular properties, such as lipophilicity and solubility, is highly desirable for rational compound design in the chemical and pharmaceutical industries. To this end, we build and apply a graph-neural-network framework called self-attention-based message-passing neural network (SAMPN) to study the relationship between chemical properties and structures in an interpretable way. The main advantages of SAMPN are that it directly uses chemical graphs and breaks the black-box mold of many machine/deep learning methods. Specifically, its attention mechanism indicates the degree to which each atom of the molecule contributes to the property of interest, and these results are easily visualized. Further, SAMPN outperforms random forests and the deep learning framework MPN from DeepChem. In addition, another formulation of SAMPN (Multi-SAMPN) can simultaneously predict multiple chemical properties with higher accuracy and efficiency than other models that predict one specific chemical property. Moreover, SAMPN can generate chemically visible and interpretable results, which can help researchers discover new pharmaceuticals and materials. The source code of the SAMPN prediction pipeline is freely available on GitHub (https://github.com/tbwxmu/SAMPN).
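The idea of attention over atoms can be illustrated with a toy sketch. The following is not the authors' implementation (that is in the GitHub repository above). It is a minimal pure-Python illustration, assuming a plain sum-over-neighbors message step, scaled dot-product self-attention, and a linear readout. All function names, the atom features, and the random weights are hypothetical. The two output columns stand in for two properties predicted jointly, in the spirit of Multi-SAMPN.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def message_passing(adj, h, w_msg, steps=2):
    """Each step: every atom adds the summed hidden states of its bonded
    neighbors to its own state, then applies a shared linear map and ReLU."""
    for _ in range(steps):
        msgs = matmul(adj, h)
        combined = [[a + b for a, b in zip(hr, mr)] for hr, mr in zip(h, msgs)]
        h = [[max(0.0, v) for v in row] for row in matmul(combined, w_msg)]
    return h

def attention_readout(h, w_out):
    """Scaled dot-product self-attention over atoms, followed by a linear
    readout. Returns the molecule-level prediction, the per-atom
    contributions that sum to it, and the attention matrix."""
    d = len(h[0])
    h_t = [list(col) for col in zip(*h)]
    scores = [softmax([v / math.sqrt(d) for v in row]) for row in matmul(h, h_t)]
    attended = matmul(scores, h)
    atom_contrib = matmul(attended, w_out)           # (n_atoms, n_properties)
    prediction = [sum(col) for col in zip(*atom_contrib)]
    return prediction, atom_contrib, scores

# Toy molecule: heavy atoms of ethanol, C-C-O (bond connectivity only).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
# Made-up 4-dimensional atom features (assumption, not the paper's featurization).
feats = [[1.0, 0.0, 0.0, 1.0],
         [1.0, 0.0, 0.0, 0.5],
         [0.0, 1.0, 0.0, 0.3]]

rng = random.Random(0)                               # untrained, random weights
w_msg = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(4)]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(4)]

hidden = message_passing(adj, feats, w_msg)
prediction, atom_contrib, scores = attention_readout(hidden, w_out)

for atom, contrib in zip(["C", "C", "O"], atom_contrib):
    print(atom, [round(c, 3) for c in contrib])
print("prediction:", [round(p, 3) for p in prediction])
```

Because the prediction is the sum of the per-atom rows, each row can be read as that atom's share of the output, which is the kind of quantity an atom-level heat map would display. With untrained random weights the numbers carry no chemical meaning.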
