Evaluation of Assay Central Machine Learning Models for Rat Acute Oral Toxicity Prediction

Abstract

Acute oral toxicity in the rat is a key endpoint for hazard identification and drug risk management. It is commonly expressed as the median lethal dose (LD50), the amount of a chemical expected to cause death in 50% of treated animals within a given period of time. LD50 studies are costly, time-consuming, and use large numbers of animals, so alternative methodologies are needed. Accordingly, several efforts over the years have sought to develop computational approaches that leverage the accumulated published LD50 data. In this study, LD50 and categorical data collected by the NTP Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM) and the EPA’s National Center for Computational Toxicology (NCCT) were used with the proprietary Assay Central software to create a suite of classification models of rat acute oral toxicity. Receiver operating characteristic (ROC) plots, together with other statistics, were used to evaluate each model’s predictive ability. Models were also generated with additional machine learning methods such as random forest, k-Nearest Neighbors, support vector classification, naïve Bayesian, AdaBoosted decision trees, and deep learning. All eight Bayesian models had 5-fold cross-validation ROC values >0.80. External validation using data provided by NICEATM and NCCT gave similarly high ROC values (>0.82), whereas a second validation set gave ROC values from 0.69 to 0.89. A comparison of algorithms showed that Assay Central (using a Bayesian algorithm) performed similarly to deep learning and k-Nearest Neighbors, indicating the potential utility of these machine learning methods for predicting rat acute oral toxicity from molecular structure alone.
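
To illustrate the ROC statistic used to compare the models, the following is a minimal sketch of how an area under the ROC curve could be computed from predicted scores and binary toxicity labels. It is not the Assay Central implementation; the function name `roc_auc`, the pairwise (Mann-Whitney) formulation, and the toy labels and scores are all assumptions made for illustration only.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 = toxic (positive class), 0 = non-toxic (negative class).
    scores: model output, higher meaning "more likely toxic".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which thresholds such as >0.80 are read. The quadratic pairwise loop is chosen for clarity; a rank-based version would be preferable for large validation sets.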
