Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines

Yang, Boyi; Wright, Adam

Computer Science > Computation and Language

arXiv:1809.05814 (cs)

[Submitted on 16 Sep 2018]

Title:Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines

Authors:Boyi Yang, Adam Wright

View PDF

Abstract:Health professionals can use natural language processing (NLP) technologies when reviewing electronic health records (EHR). Machine learning free-text classifiers can help them identify problems and make critical decisions. We aim to develop deep learning neural network algorithms that identify EHR progress notes pertaining to diabetes and validate the algorithms at two institutions. The data used are 2,000 EHR progress notes retrieved from patients with diabetes and all notes were annotated manually as diabetic or non-diabetic. Several deep learning classifiers were developed, and their performances were evaluated with the area under the ROC curve (AUC). The convolutional neural network (CNN) model with a separable convolution layer accurately identified diabetes-related notes in the Brigham and Womens Hospital testing set with the highest AUC of 0.975. Deep learning classifiers can be used to identify EHR progress notes pertaining to diabetes. In particular, the CNN-based classifier can achieve a higher AUC than an SVM-based classifier.

Comments:	9 pages, 4 figures, submitted to Journal of the American Medical Informatics Association (JAMIA) on September 15th, 2018
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1809.05814 [cs.CL]
	(or arXiv:1809.05814v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1809.05814

Submission history

From: Boyi Yang [view email]
[v1] Sun, 16 Sep 2018 04:21:38 UTC (851 KB)

Computer Science > Computation and Language

Title:Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computation and Language

Title:Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators