Deep Learning Based Regression and Multi-class Models for Acute Oral Toxicity Prediction with Automatic Chemical Feature Extraction

Xu, Youjun; Pei, Jianfeng; Lai, Luhua

arXiv:1704.04718 (stat)

[Submitted on 16 Apr 2017 (v1), last revised 4 May 2017 (this version, v3)]

Abstract: For quantitative structure-property relationship (QSPR) studies in chemoinformatics, it is important to obtain interpretable relationships between chemical properties and chemical features. However, predictive power and interpretability are usually two different objectives for QSPR models, and they are difficult to achieve simultaneously. A deep learning architecture using molecular graph encoding convolutional neural networks (MGE-CNN) provides a universal strategy for constructing interpretable QSPR models with high predictive power. Instead of using application-specific preset molecular descriptors or fingerprints, the models can be resolved using raw and pertinent features without manual intervention or selection. In this study, we developed acute oral toxicity (AOT) models of compounds using the MGE-CNN architecture as a case study. Three types of high-level predictive models were constructed for AOT evaluation: a regression model (deepAOT-R), a multi-classification model (deepAOT-C), and a multi-task model (deepAOT-CR). These models substantially outperformed previously reported models. Two external datasets were used, containing 1673 (test set I) and 375 (test set II) compounds. The R2 and mean absolute error (MAE) of deepAOT-R on test set I were 0.864 and 0.195, and the prediction accuracy of deepAOT-C was 95.5% and 96.3% on test sets I and II, respectively. The prediction accuracies of deepAOT-CR on the two external test sets were 95.0% and 94.1%, while its R2 and MAE on test set I were 0.861 and 0.204, respectively.
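
The abstract reports three evaluation metrics: R2 and MAE for the regression outputs and accuracy for the multi-class outputs. As a minimal sketch of how such numbers are typically computed, the Python below implements all three from scratch. It assumes R2 means the coefficient of determination (the abstract does not say; some QSPR work reports the squared Pearson correlation instead). The function names and the toy values are hypothetical illustrations, not data or code from the paper.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(labels_true: Sequence[int], labels_pred: Sequence[int]) -> float:
    """Fraction of compounds assigned to the correct class."""
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


if __name__ == "__main__":
    # Hypothetical toy values, NOT taken from the paper's test sets.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.0, 2.5, 4.0]
    true_classes = [1, 2, 3, 4]
    predicted_classes = [1, 2, 3, 3]

    print("R2:", r2_score(measured, predicted))
    print("MAE:", mean_absolute_error(measured, predicted))
    print("Accuracy:", accuracy(true_classes, predicted_classes))
```

For a multi-task model such as deepAOT-CR, the same functions would be applied separately to the regression output and the classification output of the one model.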

Comments:	36 pages, 4 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
Cite as:	arXiv:1704.04718 [stat.ML]
	(or arXiv:1704.04718v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1704.04718

Submission history

From: Xu Youjun
[v1] Sun, 16 Apr 2017 04:17:32 UTC (1,977 KB)
[v2] Wed, 26 Apr 2017 02:10:10 UTC (1,976 KB)
[v3] Thu, 4 May 2017 09:52:38 UTC (1,978 KB)
