research-article

Enhancing citation recommendation using citation network embedding

Authors:

Chanathip Pornprasit,

Pattararat Kiattipadungkul,

Natthawut Kertkeidkachorn,

Kyoung-Sook Kim,

Thanapon Noraset,

Saeed-Ul Hassan,

Suppawong Tuarob

Scientometrics, Volume 127, Issue 1

Pages 233 - 264

https://doi.org/10.1007/s11192-021-04196-3

Published: 01 January 2022

Abstract

Automatic citation recommendation has been a focal point of research in scholarly digital libraries. Many graph-based citation recommendation algorithms have been proposed; however, most of them rely on local citation behavior in the citation network and therefore tend to recommend papers in close proximity to the query article. In this paper, we propose to capture global citation behavior in the citation network and use it to enhance citation recommendation performance. Specifically, we develop a novel citation network embedding algorithm, ConvCN, to encode the citation relationships among papers. We then enhance existing graph-based citation recommendation algorithms by incorporating ConvCN. Incorporating ConvCN improves citation recommendation performance by 44.86% and 34.87% on average in terms of Bpref and F-measure@20, respectively. These findings not only confirm that global citation behavior is useful for improving traditional citation recommendation algorithms, but also suggest that ConvCN could be adapted to other recommendation tasks that rely on graph-like information, such as item recommendation in social networks and people recommendation in referral networks.
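For readers unfamiliar with the two evaluation measures named in the abstract, the following is a minimal sketch of how F-measure@k and Bpref are commonly computed for a single query. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the argument names (`ranked`, `relevant`, `nonrelevant`), and the choice of the Bpref variant that normalizes by min(R, N) are all assumptions, and the paper may use a different variant.

```python
def f_measure_at_k(ranked, relevant, k=20):
    """F1 score over the top-k recommendations (hypothetical helper).

    ranked:   list of recommended paper ids, best first
    relevant: set of paper ids the query article actually cites
    """
    top_k = ranked[:k]
    hits = sum(1 for doc in top_k if doc in relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(top_k)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


def bpref(ranked, relevant, nonrelevant):
    """Bpref for one query (assumed variant: penalty capped and
    normalized by min(R, N), with R judged relevant and N judged
    non-relevant papers). Unjudged papers in the ranking are ignored.
    """
    num_relevant = len(relevant)
    if num_relevant == 0:
        return 0.0
    bound = min(num_relevant, len(nonrelevant))
    nonrel_seen = 0  # judged non-relevant papers ranked so far
    score = 0.0
    for doc in ranked:
        if doc in relevant:
            if bound > 0:
                score += 1.0 - min(nonrel_seen, bound) / bound
            else:
                score += 1.0  # no judged non-relevant papers: no penalty
        elif doc in nonrelevant:
            nonrel_seen += 1
    return score / num_relevant


# Toy example with made-up paper ids (not data from the paper).
ranked = ["a", "x", "b", "y", "c"]
relevant = {"a", "b", "d"}
nonrelevant = {"x", "y"}
f5 = f_measure_at_k(ranked, relevant, k=5)
bp = bpref(ranked, relevant, nonrelevant)
```

In this toy ranking, two of the three relevant papers appear in the top five, and the second one is preceded by one judged non-relevant paper, which is what Bpref penalizes. A relative improvement such as those reported in the abstract would then be computed as (new score - baseline score) / baseline score, averaged over the algorithms being enhanced.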

Published In

Scientometrics Volume 127, Issue 1

Jan 2022

668 pages

ISSN: 0138-9130

© Akadémiai Kiadó, Budapest, Hungary 2021.

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 January 2022

Accepted: 25 October 2021

Received: 20 November 2020
