Abstract
Many would argue that the currency of research is citations; however, researchers and funding organizations alike lack tools for exploring how this currency translates into funding opportunities. Motivated by this need, we address one of the fundamental problems in developing such a tool: automatically extracting funding information from scientific articles. To this end, we experiment with a two-stage framework that ingests text, filters for paragraphs that contain funding information, and then combines sequential learning methods in a novel ensemble approach to detect named entities. We present a comparative analysis of each component of this pipeline, named FundingFinder; the results indicate that the pipeline extracts funding organizations and their associated grants from scientific articles accurately and efficiently.
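The two-stage idea described above (filter paragraphs, then combine several sequence labellers) can be illustrated with a minimal sketch. This is not the FundingFinder implementation: the cue-word filter, the three toy taggers, the label names ORG and GRT, and per-token majority voting are all assumptions chosen for illustration, since the abstract does not specify the filter, the learners, or how they are combined.

```python
import re
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A tagger maps a token list to one label per token (hypothetical interface).
Tagger = Callable[[List[str]], List[str]]

# Stage 1 (assumed): keep only paragraphs that contain a funding cue word.
FUNDING_CUES = re.compile(
    r"\b(?:fund(?:ed|ing)?|grant|supported by|financial support)\b",
    re.IGNORECASE,
)

def is_funding_paragraph(paragraph: str) -> bool:
    return FUNDING_CUES.search(paragraph) is not None

# Stage 2 (assumed): per-token majority vote over several taggers.
# On a tie, the label proposed by the earliest tagger wins.
def majority_vote(taggers: Sequence[Tagger], tokens: List[str]) -> List[str]:
    predictions = [tagger(tokens) for tagger in taggers]
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

# Three toy taggers standing in for trained sequence models.
KNOWN_FUNDERS = {"NSF", "NIH"}  # illustrative gazetteer, not from the paper

def gazetteer_tagger(tokens: List[str]) -> List[str]:
    return ["ORG" if t in KNOWN_FUNDERS else "O" for t in tokens]

def pattern_tagger(tokens: List[str]) -> List[str]:
    labels = []
    for t in tokens:
        if re.fullmatch(r"[A-Z]*-?\d{4,}", t):
            labels.append("GRT")  # looks like a grant number
        elif t.isalpha() and t.isupper() and len(t) >= 2:
            labels.append("ORG")  # all-caps acronym
        else:
            labels.append("O")
    return labels

def digit_tagger(tokens: List[str]) -> List[str]:
    labels = []
    for t in tokens:
        if t in KNOWN_FUNDERS:
            labels.append("ORG")
        elif any(ch.isdigit() for ch in t):
            labels.append("GRT")
        else:
            labels.append("O")
    return labels

def extract_funding(
    paragraphs: Sequence[str], taggers: Sequence[Tagger]
) -> List[List[Tuple[str, str]]]:
    """Return the (token, label) pairs voted non-'O' for each funding paragraph."""
    results = []
    for paragraph in paragraphs:
        if not is_funding_paragraph(paragraph):
            continue  # stage 1 rejects paragraphs without funding cues
        tokens = paragraph.split()
        labels = majority_vote(taggers, tokens)
        results.append([(t, l) for t, l in zip(tokens, labels) if l != "O"])
    return results

paragraphs = [
    "We study graphs of size 2000 .",
    "This work was funded by NSF grant 1234567 .",
]
taggers = [gazetteer_tagger, pattern_tagger, digit_tagger]
found = extract_funding(paragraphs, taggers)
```

In this toy run the first paragraph is dropped by the filter, so the number 2000 is never considered as a grant identifier, which illustrates why filtering before entity detection can reduce false positives.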
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Kayal, S., Afzal, Z., Tsatsaronis, G., Doornenbal, M., Katrenko, S., Gregory, M. (2019). A Framework to Automatically Extract Funding Information from Text. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2018. Lecture Notes in Computer Science, vol 11331. Springer, Cham. https://doi.org/10.1007/978-3-030-13709-0_27
DOI: https://doi.org/10.1007/978-3-030-13709-0_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13708-3
Online ISBN: 978-3-030-13709-0
eBook Packages: Computer Science, Computer Science (R0)