
Exploiting Higher Order Multi-dimensional Relationships with Self-attention for Author Name Disambiguation

Authors:

Km Pooja,

Samrat Mondal,

Joydeep Chandra

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 16, Issue 5

Article No.: 88, Pages 1 - 23

https://doi.org/10.1145/3502730

Published: 09 March 2022


Abstract

Name ambiguity is a prevalent problem in scholarly publications, driven by the unprecedented growth of digital libraries and of the number of researchers. In the absence of a unique identifier, an author is identified only by name. Because of this ambiguity, documents can be assigned to the wrong author, which may lead to an improper assessment of that author. The literature offers both supervised and unsupervised approaches to the name disambiguation problem. Unsupervised approaches are preferred because large amounts of unlabeled data are available. Bibliographic data contain heterogeneous features, so representation learning-based techniques have recently been used to embed these features in a common space. The documents of a scholar are connected by multiple relations, and research has recently shifted from a single homogeneous relation to multi-dimensional (heterogeneous) relations for learning the latent representation of a document. Connections in these graphs are sparse, and higher-order links between documents provide additional clues. We therefore use multiple neighborhoods in the different relation types of a heterogeneous graph to represent documents. However, neighborhoods of different orders within each relation type differ in importance, which we also validate empirically. To properly exploit both the different neighborhoods within a relation type and the importance of each relation type in the heterogeneous graph, we propose an attention-based, multi-dimensional, multi-hop neighborhood-based graph convolution network for embedding that uses two levels of attention: (i) relation level and (ii) neighborhood level within each relation. The proposed approach obtains a significant improvement over existing state-of-the-art methods on various evaluation metrics.
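The two-level attention idea described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' model: the abstract gives no equations, so the scoring function (a query vector dotted with the tanh of a mean-pooled view), the row-normalized propagation, the fixed query vectors `hop_query` and `rel_query`, and all function names below are assumptions chosen only to show how hop-level and relation-level weights might be combined. In a real system the query vectors would be learned and the propagation would include trainable weight matrices.

```python
import math

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def row_normalize(adj):
    """Add self-loops, then scale each row so it sums to 1."""
    n = len(adj)
    looped = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in looped]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(candidates, query):
    """Score each candidate matrix (assumed scoring rule), softmax, and mix."""
    scores = []
    for mat in candidates:
        mean_row = [sum(col) / len(mat) for col in zip(*mat)]
        scores.append(sum(q * math.tanh(v) for q, v in zip(query, mean_row)))
    weights = softmax(scores)
    n, d = len(candidates[0]), len(candidates[0][0])
    mixed = [[sum(w * mat[i][j] for w, mat in zip(weights, candidates))
              for j in range(d)] for i in range(n)]
    return mixed, weights

def embed(relations, features, hops, hop_query, rel_query):
    """Hypothetical two-level attention over multi-hop, multi-relation views."""
    per_relation, hop_weights = [], {}
    for name, adj in relations.items():
        a_hat = row_normalize(adj)
        h, hop_views = features, []
        for _ in range(hops):
            h = matmul(a_hat, h)          # propagate one hop further
            hop_views.append(h)
        # Neighborhood-level attention: weigh the 1..K hop views of this relation.
        z_r, alpha = attend(hop_views, hop_query)
        per_relation.append(z_r)
        hop_weights[name] = alpha
    # Relation-level attention: weigh the per-relation embeddings.
    z, beta = attend(per_relation, rel_query)
    return z, hop_weights, dict(zip(relations, beta))

# Toy data (made up): four documents, two relation types, 2-d features.
relations = {
    "coauthor": [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
    "venue":    [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 1], [1, 0, 1, 0]],
}
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

z, hop_w, rel_w = embed(relations, features, hops=2,
                        hop_query=[0.5, -0.5], rel_query=[1.0, 1.0])
print(rel_w)   # one weight per relation type
print(hop_w)   # one weight per hop, for each relation type
```

The resulting document embeddings `z` could then be passed to any clustering step to group documents by author; that step is left out here because the abstract does not specify it.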

