Publication | Open Access

Comprehensive ensemble in QSAR prediction for drug discovery

Year: 2019 | Citations: 227 | References: 36

TLDR

QSAR links chemical structure to biological activity and is essential for drug discovery, yet its predictive power is limited. Ensemble machine-learning methods have been used to increase model diversity, but most existing approaches restrict that diversity to a single subject. The study proposes a comprehensive ensemble framework that builds diversified models across multiple subjects and integrates them via second-level meta-learning. It also introduces an end-to-end neural network-based individual classifier that extracts sequential features directly from SMILES strings.

Abstract

Background

Quantitative structure-activity relationship (QSAR) modeling is a computational method for revealing relationships between the structural properties of chemical compounds and their biological activities. QSAR modeling is essential for drug discovery, but it has many constraints. Ensemble-based machine learning approaches have been used to overcome these constraints and obtain reliable predictions. Ensemble learning builds a set of diversified models and combines them. However, random forest, the most prevalent approach, and other ensemble approaches in QSAR prediction limit their model diversity to a single subject.

Results

The proposed ensemble method consistently outperformed thirteen individual models on 19 bioassay datasets and demonstrated superiority over other ensemble approaches that are limited to a single subject. The comprehensive ensemble method is publicly available at http://data.snu.ac.kr/QSAR/ .

Conclusions

We propose a comprehensive ensemble method that builds multi-subject diversified models and combines them through second-level meta-learning. In addition, we propose an end-to-end neural network-based individual classifier that can automatically extract sequential features from a simplified molecular-input line-entry system (SMILES) string. The proposed individual model did not show impressive results as a single model, but according to the interpretation of the meta-learning it was the most important predictor when combined with the other models.
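To make the idea of second-level meta-learning concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a generic stacking setup: first-level models produce out-of-fold probabilities, and a logistic-regression meta-learner is trained on those probabilities. The names `FeatureStump`, `LogisticMeta`, `fit_stacking`, and `predict_stacking`, as well as the toy data, are hypothetical and chosen only for illustration. The paper's actual base learners, input representations, and meta-learner may differ.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class FeatureStump:
    """Hypothetical first-level model that scores a single input feature."""

    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        vals = [row[self.feature] for row in X]
        self.center = sum(vals) / len(vals)
        pos = [v for v, t in zip(vals, y) if t == 1]
        neg = [v for v, t in zip(vals, y) if t == 0]
        # Flip the sign when class 1 has the smaller feature mean.
        self.sign = 1.0
        if pos and neg and sum(pos) / len(pos) < sum(neg) / len(neg):
            self.sign = -1.0
        return self

    def predict_proba(self, X):
        return [sigmoid(self.sign * (row[self.feature] - self.center)) for row in X]


class LogisticMeta:
    """Hypothetical second-level learner: logistic regression by gradient descent."""

    def fit(self, Z, y, lr=0.5, epochs=500):
        n, d = len(Z), len(Z[0])
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(epochs):
            grad_w, grad_b = [0.0] * d, 0.0
            for z, t in zip(Z, y):
                err = self.predict_one(z) - t
                for j in range(d):
                    grad_w[j] += err * z[j]
                grad_b += err
            self.w = [w - lr * g / n for w, g in zip(self.w, grad_w)]
            self.b -= lr * grad_b / n
        return self

    def predict_one(self, z):
        return sigmoid(sum(w * v for w, v in zip(self.w, z)) + self.b)


def k_folds(n, k):
    """Yield (train_indices, valid_indices) for k contiguous folds."""
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        valid = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        yield train, valid
        start += size


def fit_stacking(factories, X, y, k=5):
    """Train the meta-learner on out-of-fold first-level probabilities."""
    oof = [[0.0] * len(factories) for _ in X]
    for train, valid in k_folds(len(X), k):
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        X_va = [X[i] for i in valid]
        for m, make in enumerate(factories):
            for i, p in zip(valid, make().fit(X_tr, y_tr).predict_proba(X_va)):
                oof[i][m] = p
    meta = LogisticMeta().fit(oof, y)
    bases = [make().fit(X, y) for make in factories]  # refit on all data
    return bases, meta


def predict_stacking(bases, meta, X):
    columns = [b.predict_proba(X) for b in bases]
    return [meta.predict_one(list(z)) for z in zip(*columns)]


# Toy data (an assumption): feature 0 determines the label, feature 1 is noise.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
y = [1 if row[0] > 0 else 0 for row in X]

bases, meta = fit_stacking([lambda: FeatureStump(0), lambda: FeatureStump(1)], X, y)
probs = predict_stacking(bases, meta, X)
accuracy = sum((p >= 0.5) == (t == 1) for p, t in zip(probs, y)) / len(y)
```

In this sketch, the learned weights in `meta.w` give a rough indication of how much the meta-learner relies on each first-level model, which is loosely analogous to the meta-learning interpretation mentioned in the conclusions. Training the meta-learner on out-of-fold predictions rather than in-sample ones is a common way to keep it from simply rewarding overfitted base models.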
