
An unsupervised heuristic-based hierarchical method for name disambiguation in bibliographic citations

Published: 01 September 2010 Publication History

Abstract

Name ambiguity in bibliographic citations is a difficult problem that, despite many efforts from the research community, still leaves considerable room for improvement. In this article, we present a heuristic-based hierarchical clustering method to deal with this problem. The method successively fuses clusters of citations with similar author names, based on several heuristics and on similarity measures over the components of the citations (e.g., coauthor names, work title, and publication venue title). During the disambiguation task, the information from fused clusters is aggregated, providing more information for the next round of fusion. To demonstrate the effectiveness of our method, we ran a series of experiments on two collections extracted from real-world digital libraries and compared it, under two metrics, with four representative methods described in the literature. We present comparisons of results using each considered attribute separately (i.e., coauthor names, work title, and publication venue title) together with the author name attribute, and using all attributes together. These results show that our unsupervised method, when using all attributes, performs competitively against all other methods under both metrics, losing in only one case to a supervised method whose result was very close to ours. Moreover, these results are achieved without the burden of any training and without using any privileged information, such as knowing a priori the correct number of clusters. © 2010 Wiley Periodicals, Inc.
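
The fusion loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the heuristics, similarity measures, or thresholds, so everything below is an assumption chosen for illustration. The names `Cluster`, `jaccard`, `should_fuse`, and `disambiguate` are hypothetical, Jaccard overlap of word sets stands in for the unspecified similarity measures, and the thresholds `title_min` and `venue_min` are arbitrary. The input is assumed to be citations that already share a similar (ambiguous) author name.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Citations currently believed to belong to one author (hypothetical type)."""
    citations: list[int]
    coauthors: set[str] = field(default_factory=set)
    title_words: set[str] = field(default_factory=set)
    venue_words: set[str] = field(default_factory=set)

    def absorb(self, other: "Cluster") -> None:
        # Aggregate the fused cluster's evidence for later fusion rounds.
        self.citations += other.citations
        self.coauthors |= other.coauthors
        self.title_words |= other.title_words
        self.venue_words |= other.venue_words


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap; an assumed stand-in for the paper's similarity measures."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def should_fuse(a: Cluster, b: Cluster,
                title_min: float = 0.3, venue_min: float = 0.5) -> bool:
    # Assumed heuristic: a shared coauthor is strong evidence of the same author;
    # otherwise fall back to title or venue similarity (arbitrary thresholds).
    if a.coauthors & b.coauthors:
        return True
    return (jaccard(a.title_words, b.title_words) >= title_min
            or jaccard(a.venue_words, b.venue_words) >= venue_min)


def disambiguate(citations: list[dict]) -> list[list[int]]:
    """Start with one cluster per citation and fuse until nothing changes."""
    clusters = [
        Cluster([i], set(c["coauthors"]),
                set(c["title"].lower().split()),
                set(c["venue"].lower().split()))
        for i, c in enumerate(citations)
    ]
    fused = True
    while fused:
        fused = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if should_fuse(clusters[i], clusters[j]):
                    clusters[i].absorb(clusters[j])
                    del clusters[j]
                    fused = True
                    break
            if fused:
                break  # restart the scan with the aggregated information
    return [sorted(c.citations) for c in clusters]


# Made-up citations that all carry the same ambiguous author name.
example = [
    {"coauthors": ["B. Lee"], "title": "Clustering citations by author", "venue": "JCDL"},
    {"coauthors": ["B. Lee", "C. Wu"], "title": "Graph indexing", "venue": "VLDB"},
    {"coauthors": ["C. Wu"], "title": "Query optimization", "venue": "SIGMOD"},
    {"coauthors": ["D. Kim"], "title": "Protein folding dynamics", "venue": "Bioinformatics"},
]
print(disambiguate(example))  # [[0, 1, 2], [3]]
```

In this made-up example, citations 0 and 2 share no coauthor, title word, or venue word, so they would never be fused directly. They end up together only because citation 1 is fused with citation 0 first, and the aggregated coauthor set then overlaps with citation 2. Note also that the number of clusters is an output of the loop rather than an input.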



      Published In

      Journal of the American Society for Information Science and Technology  Volume 61, Issue 9
      September 2010
      215 pages

      Publisher

      John Wiley & Sons, Inc.

      United States


      Author Tags

      1. automatic classification
      2. bibliographic citations
      3. disambiguation
      4. heuristics
      5. proper names

      Qualifiers

      • Article


      Cited By

      • (2023) Web-Scale Academic Name Disambiguation: The WhoIsWho Benchmark, Leaderboard, and Toolkit. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3817-3828. DOI: 10.1145/3580305.3599930. Online publication date: 6-Aug-2023
      • (2022) Exploiting Higher Order Multi-dimensional Relationships with Self-attention for Author Name Disambiguation. ACM Transactions on Knowledge Discovery from Data 16:5, pp. 1-23. DOI: 10.1145/3502730. Online publication date: 9-Mar-2022
      • (2022) A knowledge graph embeddings based approach for author name disambiguation using literals. Scientometrics 127:8, pp. 4887-4912. DOI: 10.1007/s11192-022-04426-2. Online publication date: 1-Aug-2022
      • (2022) Completing features for author name disambiguation (AND): an empirical analysis. Scientometrics 127:2, pp. 1039-1063. DOI: 10.1007/s11192-021-04229-x. Online publication date: 1-Feb-2022
      • (2021) Exploiting similarities across multiple dimensions for author name disambiguation. Scientometrics 126:9, pp. 7525-7560. DOI: 10.1007/s11192-021-04101-y. Online publication date: 1-Sep-2021
      • (2021) Multilayer heuristics based clustering framework (MHCF) for author name disambiguation. Scientometrics 126:9, pp. 7637-7678. DOI: 10.1007/s11192-021-04087-7. Online publication date: 1-Sep-2021
      • (2021) Ethnicity‐based name partitioning for author name disambiguation using supervised machine learning. Journal of the Association for Information Science and Technology 72:8, pp. 979-994. DOI: 10.1002/asi.24459. Online publication date: 5-Jul-2021
      • (2020) Collecting large-scale publication data at the level of individual researchers: a practical proposal for author name disambiguation. Scientometrics 123:2, pp. 883-907. DOI: 10.1007/s11192-020-03410-y. Online publication date: 1-May-2020
      • (2020) Effect of forename string on author name disambiguation. Journal of the Association for Information Science and Technology 71:7, pp. 839-855. DOI: 10.1002/asi.24298. Online publication date: 8-Jun-2020
      • (2019) A fast and integrative algorithm for clustering performance evaluation in author name disambiguation. Scientometrics 120:2, pp. 661-681. DOI: 10.1007/s11192-019-03143-7. Online publication date: 1-Aug-2019