Towards employing native information in citation function classification

Zhang, Yang; Zhao, Rongying; Wang, Yufei; Chen, Haihua; Mahmood, Adnan; Zaib, Munazza; Zhang, Wei Emma; Sheng, Quan Z.

doi:10.1007/s11192-021-04242-0

Published: 16 January 2022

Volume 127, pages 6557–6577, (2022)

Scientometrics

Yang Zhang¹,²,
Rongying Zhao¹,
Yufei Wang²,
Haihua Chen³,
Adnan Mahmood²,
Munazza Zaib²,
Wei Emma Zhang⁴ &
…
Quan Z. Sheng²

A Correction to this article was published on 18 July 2022

This article has been updated

Abstract

Citations play a fundamental role in supporting authors’ contribution claims throughout a scientific paper. Labelling citation instances with function labels is indispensable for understanding a scientific text. A single citation is the link between two scientific papers in the citation network. Citations carry rich native information, including the citation context, the citation location, the citing and cited paper titles, the DOI, and the website URL. Nevertheless, previous studies have ignored this native information when constructing datasets, leaving the citation function classification task without a comprehensive set of valuable features. In this paper, we argue that this information should not be ignored, and accordingly, we extract all of the native information features and integrate them into different neural text representation models via trainable embeddings and free text. We first construct a new dataset, NI-Cite, comprising a large number of labelled citations, each with five key native features (Citation Context, Section Name, Title, DOI, Web URL). In addition, we exploit recently developed text representation models integrated with this information to evaluate performance on the citation function classification task. The experimental results demonstrate that the native information features suggested in this paper enhance overall classification performance.
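The "free text" route mentioned in the abstract can be pictured with a small sketch. The following Python fragment is only an illustration under stated assumptions: the class name `CitationInstance`, its field names, the field order, the `[SEP]` separator, and the example values (including the label `Uses` and the placeholder DOI) are all hypothetical and are not taken from the NI-Cite dataset or the authors' code. It shows one plausible way to hold the five native features for a citation and flatten them into a single string that a text representation model could consume. The trainable-embedding route is not shown.

```python
from dataclasses import dataclass


@dataclass
class CitationInstance:
    """One labelled citation with the five native features named in the abstract.

    Field names are hypothetical; the actual NI-Cite schema may differ.
    """
    citation_context: str
    section_name: str
    title: str
    doi: str
    web_url: str
    label: str  # citation function label; the real label set is not given here


def to_free_text(inst: CitationInstance, sep: str = " [SEP] ") -> str:
    """Concatenate the native features into one string for a text encoder.

    The field order and the separator are assumptions made for illustration.
    """
    fields = [
        ("section", inst.section_name),
        ("title", inst.title),
        ("doi", inst.doi),
        ("url", inst.web_url),
        ("context", inst.citation_context),
    ]
    # Skip empty fields so that missing metadata adds no tokens.
    return sep.join(f"{name}: {value}" for name, value in fields if value)


# Made-up example values, not drawn from any real dataset.
example = CitationInstance(
    citation_context="We follow the setup of [CITATION] for training.",
    section_name="Methods",
    title="An example cited paper title",
    doi="10.0000/example",
    web_url="",
    label="Uses",
)
print(to_free_text(example))
```

In this sketch an empty feature (here the web URL) is simply omitted from the flattened string; whether the original work handles missing features this way is not stated in the abstract.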

Change history

18 July 2022
A Correction to this paper has been published: https://doi.org/10.1007/s11192-022-04451-1

Acknowledgements

This research is funded by the Australian Research Council (ARC) Discovery Project DP200102298 and the National Social Science Fund of China (No. 18ZDA325).

Author information

Authors and Affiliations

School of Information Management, Wuhan University, Wuhan, Hubei Province, People’s Republic of China
Yang Zhang & Rongying Zhao
School of Computing, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, 2109, Australia
Yang Zhang, Yufei Wang, Adnan Mahmood, Munazza Zaib & Quan Z. Sheng
Department of Information Science, University of North Texas, Denton, TX, 76207, USA
Haihua Chen
School of Computer Science, The University of Adelaide, North Terrace, Adelaide, SA, 5005, Australia
Wei Emma Zhang

Corresponding author

Correspondence to Rongying Zhao.

Additional information

The original online version of this article was revised: In the original version the first affiliation was incorrectly linked to the author name, Adnan Mahmood.

About this article

Cite this article

Zhang, Y., Zhao, R., Wang, Y. et al. Towards employing native information in citation function classification. Scientometrics 127, 6557–6577 (2022). https://doi.org/10.1007/s11192-021-04242-0

Received: 16 June 2021
Accepted: 08 December 2021
Published: 16 January 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s11192-021-04242-0
