research-article

Convolutional neural networks for biomedical text classification: application in indexing biomedical articles

Authors:

Ramakanth Kavuluru

BCB '15: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics

Pages 258 - 267

https://doi.org/10.1145/2808719.2808746

Published: 09 September 2015

Abstract

Building high-accuracy text classifiers is an important task in biomedicine, given the wealth of information hidden in unstructured narratives such as research articles and clinical documents. Because feature spaces are large, text classification has traditionally relied on discriminative approaches such as logistic regression and support vector machines with n-gram and semantic features (e.g., named entities), with additional performance gains typically coming from feature selection and ensemble approaches. In this paper, we demonstrate that a more direct approach using convolutional neural networks (CNNs) outperforms several traditional approaches in biomedical text classification, with the specific use case of assigning Medical Subject Headings (MeSH terms) to biomedical articles. Trained annotators at the National Library of Medicine (NLM) assign, on average, 13 codes to each biomedical article, thus semantically indexing scientific literature to support NLM's PubMed search system. Recent evidence suggests that effective automated efforts for MeSH term assignment start with binary classifiers for each term. We use CNNs to build binary text classifiers and achieve an absolute improvement of over 3% in macro F-score on a set of selected hard-to-classify MeSH terms when compared with the best prior results on a public dataset. Additional experiments on 50 high-frequency terms in the dataset also show improvements with CNNs. Our results indicate the strong potential of CNNs in biomedical text classification tasks.
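The abstract reports results as a macro F-score over per-term binary classifiers. The sketch below illustrates one common reading of that metric: compute F1 separately for each MeSH term, then take the unweighted mean across terms. This is an assumption, since the abstract does not give the exact definition used in the paper. The function names, the two term names, and the toy labels are hypothetical and serve only as an illustration.

```python
from typing import Dict, List


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for one binary classifier (1 = term assigned, 0 = not assigned)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # Convention assumed here: with no true positives, F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold_by_term: Dict[str, List[int]],
             pred_by_term: Dict[str, List[int]]) -> float:
    """Unweighted mean of per-term F1 scores (assumed definition of macro F)."""
    scores = [f1_score(gold_by_term[t], pred_by_term[t]) for t in gold_by_term]
    return sum(scores) / len(scores)


# Toy data: four articles, two illustrative terms, one binary decision each.
gold = {"Humans": [1, 1, 0, 1], "Mice": [0, 1, 0, 0]}
pred = {"Humans": [1, 0, 0, 1], "Mice": [1, 1, 0, 0]}

print(round(macro_f1(gold, pred), 4))
```

Because every term contributes equally to the mean, rare or hard-to-classify terms weigh as much as frequent ones, which is why macro averaging suits an evaluation focused on hard-to-classify terms.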

Index Terms

Convolutional neural networks for biomedical text classification: application in indexing biomedical articles
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Information

Published In

BCB '15: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics

September 2015

683 pages

ISBN:9781450338530

DOI:10.1145/2808719

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGBio: ACM Special Interest Group on Bioinformatics

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Institutes of Health

Conference

BCB '15

Sponsor:

SIGBio

BCB '15: ACM International Conference on Bioinformatics, Computational Biology and Biomedicine

September 9 - 12, 2015

Atlanta, Georgia

Acceptance Rates

BCB '15 Paper Acceptance Rate 48 of 141 submissions, 34%;

Overall Acceptance Rate 254 of 885 submissions, 29%

Bibliometrics

Total Citations: 54
Total Downloads: 763
Downloads (Last 12 months): 49
Downloads (Last 6 weeks): 8

Reflects downloads up to 25 Oct 2024
Cited By

Nunez J, Leung B, Ho C, Ng R, Bates A (2024). Predicting which patients with cancer will see a psychiatrist or counsellor from their initial oncology consultation document using natural language processing. Communications Medicine 4:1. Online publication date: 8-Apr-2024.
https://doi.org/10.1038/s43856-024-00495-x
Wang B, Gao Z, Lin Z, Wang R (2023). A Disease-Prediction Protocol Integrating Triage Priority and BERT-Based Transfer Learning for Intelligent Triage. Bioengineering 10:4 (420). Online publication date: 27-Mar-2023.
https://doi.org/10.3390/bioengineering10040420
Nguyen H, Huynh T, Mai N, Le K, Thi-Ngoc-Diem P (2023). PhoBERT. Applied Computer Systems 28:1 (35-43). Online publication date: 1-Jun-2023.
https://dl.acm.org/doi/10.2478/acss-2023-0004
Parwez M, Fazil M, Arif M, Nafis M, Auwul M (2023). Biomedical Text Classification Using Augmented Word Representation Based on Distributional and Relational Contexts. Computational Intelligence and Neuroscience 2023. Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.1155/2023/2989791
Sworna Z, Islam C, Babar M (2023). APIRO: A Framework for Automated Security Tools API Recommendation. ACM Transactions on Software Engineering and Methodology 32:1 (1-42). Online publication date: 13-Feb-2023.
https://dl.acm.org/doi/10.1145/3512768
Cloutier N, Japkowicz N (2023). Fine-tuned generative LLM oversampling can improve performance over traditional techniques on multiclass imbalanced text classification. 2023 IEEE International Conference on Big Data (BigData) (5181-5186). Online publication date: 15-Dec-2023.
https://doi.org/10.1109/BigData59044.2023.10386772
Anagun Y, Bolel N, Isik S, Ozkan S (2023). DEEP LEARNING-BASED CUSTOMER COMPLAINT MANAGEMENT. Journal of Organizational Computing and Electronic Commerce 32:3-4 (217-231). Online publication date: 6-Jun-2023.
https://doi.org/10.1080/10919392.2023.2210049
Cai L, Li J, Lv H, Liu W, Niu H, Wang Z (2023). Integrating domain knowledge for biomedical text analysis into deep learning: A survey. Journal of Biomedical Informatics 143 (104418). Online publication date: Jul-2023.
https://doi.org/10.1016/j.jbi.2023.104418
Flores C, Verschae R (2023). Combining Regular Expressions and Supervised Algorithms for Clinical Text Classification. Intelligent Data Engineering and Automated Learning – IDEAL 2023 (381-392). Online publication date: 15-Nov-2023.
https://doi.org/10.1007/978-3-031-48232-8_35
Baker M, Mohammed E, Jihad K (2023). Prediction of Colon Cancer Related Tweets Using Deep Learning Models. Intelligent Systems Design and Applications (522-532). Online publication date: 31-May-2023.
https://doi.org/10.1007/978-3-031-27440-4_50
