DOI: 10.1145/2808719.2808746
research-article

Convolutional neural networks for biomedical text classification: application in indexing biomedical articles

Published: 09 September 2015

Abstract

Building high-accuracy text classifiers is an important task in biomedicine, given the wealth of information hidden in unstructured narratives such as research articles and clinical documents. Because feature spaces are large, text classification has traditionally relied on discriminative approaches such as logistic regression and support vector machines with n-gram and semantic features (e.g., named entities), with additional performance gains typically coming from feature selection and ensemble approaches. In this paper, we demonstrate that a more direct approach using convolutional neural networks (CNNs) outperforms several traditional approaches in biomedical text classification, with the specific use case of assigning Medical Subject Headings (MeSH terms) to biomedical articles. Trained annotators at the National Library of Medicine (NLM) assign an average of 13 codes to each biomedical article, thus semantically indexing scientific literature to support NLM's PubMed search system. Recent evidence suggests that effective automated efforts for MeSH term assignment start with binary classifiers for each term. We use CNNs to build binary text classifiers and achieve an absolute improvement of over 3% in macro F-score over a set of selected hard-to-classify MeSH terms when compared with the best prior results on a public dataset. Additional experiments on 50 high-frequency terms in the dataset also show improvements with CNNs. Our results indicate the strong potential of CNNs in biomedical text classification tasks.
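
The evaluation setup described in the abstract (one binary classifier per MeSH term, scored by macro F-score) can be illustrated with a minimal sketch. It assumes the common definition of macro F-score as the unweighted mean of per-term F1 values; the abstract does not spell out the exact formula used in the paper. The function names, term names, and labels below are hypothetical toy values, not data or code from the paper.

```python
from typing import Dict, List, Tuple


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for one binary classifier; 0.0 when there are no true positives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_term: Dict[str, Tuple[List[int], List[int]]]) -> float:
    """Unweighted mean of per-term F1 scores (assumed macro definition)."""
    scores = [f1_score(gold, pred) for gold, pred in per_term.values()]
    return sum(scores) / len(scores)


# Hypothetical toy labels: (gold, predicted) for four articles per term.
toy_results = {
    "Humans": ([1, 1, 0, 1], [1, 1, 0, 0]),  # tp=2, fp=0, fn=1
    "Mice": ([0, 1, 0, 0], [0, 1, 1, 0]),    # tp=1, fp=1, fn=0
}

if __name__ == "__main__":
    print(f"macro F1 = {macro_f1(toy_results):.4f}")
```

Because every term contributes equally to the average, a gain on rare, hard-to-classify terms moves the macro score as much as a gain on frequent ones, which is why macro averaging suits the hard-term comparison the abstract reports.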

      Published In

      BCB '15: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics
      September 2015
      683 pages
      ISBN:9781450338530
      DOI:10.1145/2808719
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. convolutional neural networks
      2. medical subject headings
      3. text classification

      Qualifiers

      • Research-article

      Acceptance Rates

      BCB '15 Paper Acceptance Rate 48 of 141 submissions, 34%;
      Overall Acceptance Rate 254 of 885 submissions, 29%

      Article Metrics

      • Downloads (last 12 months): 49
      • Downloads (last 6 weeks): 8
      Reflects downloads up to 25 Oct 2024

      Cited By

      • (2024) Predicting which patients with cancer will see a psychiatrist or counsellor from their initial oncology consultation document using natural language processing. Communications Medicine, 4:1. DOI: 10.1038/s43856-024-00495-x. Online publication date: 8-Apr-2024
      • (2023) A Disease-Prediction Protocol Integrating Triage Priority and BERT-Based Transfer Learning for Intelligent Triage. Bioengineering, 10:4 (420). DOI: 10.3390/bioengineering10040420. Online publication date: 27-Mar-2023
      • (2023) PhoBERT. Applied Computer Systems, 28:1 (35-43). DOI: 10.2478/acss-2023-0004. Online publication date: 1-Jun-2023
      • (2023) Biomedical Text Classification Using Augmented Word Representation Based on Distributional and Relational Contexts. Computational Intelligence and Neuroscience, 2023. DOI: 10.1155/2023/2989791. Online publication date: 1-Jan-2023
      • (2023) APIRO: A Framework for Automated Security Tools API Recommendation. ACM Transactions on Software Engineering and Methodology, 32:1 (1-42). DOI: 10.1145/3512768. Online publication date: 13-Feb-2023
      • (2023) Fine-tuned generative LLM oversampling can improve performance over traditional techniques on multiclass imbalanced text classification. 2023 IEEE International Conference on Big Data (BigData) (5181-5186). DOI: 10.1109/BigData59044.2023.10386772. Online publication date: 15-Dec-2023
      • (2023) Deep Learning-Based Customer Complaint Management. Journal of Organizational Computing and Electronic Commerce, 32:3-4 (217-231). DOI: 10.1080/10919392.2023.2210049. Online publication date: 6-Jun-2023
      • (2023) Integrating domain knowledge for biomedical text analysis into deep learning: A survey. Journal of Biomedical Informatics, 143 (104418). DOI: 10.1016/j.jbi.2023.104418. Online publication date: Jul-2023
      • (2023) Combining Regular Expressions and Supervised Algorithms for Clinical Text Classification. Intelligent Data Engineering and Automated Learning – IDEAL 2023 (381-392). DOI: 10.1007/978-3-031-48232-8_35. Online publication date: 15-Nov-2023
      • (2023) Prediction of Colon Cancer Related Tweets Using Deep Learning Models. Intelligent Systems Design and Applications (522-532). DOI: 10.1007/978-3-031-27440-4_50. Online publication date: 31-May-2023
