DOI: 10.1145/3589334.3645596
research-article

Author Name Disambiguation via Paper Association Refinement and Compositional Contrastive Embedding

Published: 13 May 2024

Abstract

Author name disambiguation (AND) is an essential task for online academic retrieval systems. Recent models apply representation learning to AND. Despite their remarkable success, these methods may be limited in two respects. First, the heuristically constructed paper association graphs used for representation learning contain uncertainties that may introduce negative supervision. Second, existing training objectives, such as binary cross-entropy loss, may not produce representations of sufficiently high quality for AND. To tackle these problems, we propose an association refining and compositional contrasting (ARCC) framework for AND. ARCC first applies an iterative graph structure refinement process to dynamically reduce the uncertainties in paper graphs. It then uses a compositional contrastive learning method to encourage more discriminative representations for AND. Empirical studies on two benchmark datasets suggest that ARCC is effective for AND and outperforms state-of-the-art models.
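
To make the contrast with binary cross-entropy concrete, here is a minimal sketch of a generic supervised contrastive loss over paper embeddings, in the style of Khosla et al. It is not ARCC's compositional objective, which the abstract does not specify. The function names, the toy embeddings, the author labels, and the temperature of 0.5 are all assumptions made for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, author_ids, temperature=0.5):
    """Generic SupCon-style loss (illustrative; not the ARCC objective).

    Papers sharing an author id are positives for each other; all other
    papers in the batch are negatives. Anchors without any positive are
    skipped, and the loss is averaged over the remaining anchors.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and author_ids[j] == author_ids[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other paper.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        # Mean negative log-probability of the positives for this anchor.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight groups of paper embeddings (hypothetical values).
papers = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supervised_contrastive_loss(papers, ["a", "a", "b", "b"])
loss_mismatched = supervised_contrastive_loss(papers, ["a", "b", "a", "b"])
print(loss_matching < loss_mismatched)
```

In this toy batch the loss is lower when the author labels agree with the geometry of the embeddings than when they cut across it. That is the sense in which a contrastive objective rewards discriminative representations.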

    Published In

    WWW '24: Proceedings of the ACM Web Conference 2024
    May 2024
    4826 pages
    ISBN:9798400701719
    DOI:10.1145/3589334

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. author name disambiguation
    2. contrastive learning
    3. graph structure refinement

    Conference

    WWW '24
    WWW '24: The ACM Web Conference 2024
    May 13 - 17, 2024
    Singapore, Singapore

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
