DOI: 10.1145/3360901.3364440
short-paper

Contextual Entity Disambiguation in Domains with Weak Identity Criteria: Disambiguating Golden Age Amsterdamers

Published: 23 September 2019
    Abstract

    Entity disambiguation is a widely investigated topic, and many matching algorithms have been proposed. However, the task has not yet been satisfactorily addressed for domains whose data is poor or incomplete and has little discriminating power. In these cases, content fields such as name and date are not enough, and relations with other entities help little when those related entities must themselves be disambiguated before they can be used. We therefore propose an approach for disambiguating clustered resources that uses context (related entities that are also clustered) as evidence for reconciling matched entities. We test the method on datasets of historical records from 17th-century Amsterdam for which context is available, and we compare its results to a gold standard generated by three experts, which we make available online. The results show that the approach uses context meaningfully to isolate higher-quality identity sub-clusters by eliminating potentially false positive matches.
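The idea of using clustered context as evidence can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it only assumes that each record in a matched cluster carries a set of identifiers for the clusters its related entities belong to. The function name `split_by_context`, the `min_shared` threshold, and the record and context identifiers are all hypothetical. Two records stay together only if they share enough context clusters, and the connected groups that remain are the sub-clusters.

```python
from collections import defaultdict
from itertools import combinations


def split_by_context(cluster, context_of, min_shared=1):
    """Split one identity cluster into sub-clusters backed by shared context.

    cluster:    iterable of record ids that a matcher grouped together
    context_of: dict mapping a record id to the set of ids of the
                clusters its related entities fall in (assumed input)
    min_shared: number of context clusters two records must share to
                count as corroborated (hypothetical threshold)
    """
    records = list(cluster)
    parent = {r: r for r in records}  # union-find forest, one tree per record

    def find(r):
        # Follow parent links to the root, shortening the path on the way.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    # Merge two records only when their contexts overlap enough.
    for a, b in combinations(records, 2):
        shared = context_of.get(a, set()) & context_of.get(b, set())
        if len(shared) >= min_shared:
            parent[find(a)] = find(b)

    # Collect records by root; records without evidence stay singletons.
    groups = defaultdict(set)
    for r in records:
        groups[find(r)].add(r)
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


# Illustrative data only: four records matched on name, with made-up context.
matched = ["r1", "r2", "r3", "r4"]
context = {
    "r1": {"spouse_A"},
    "r2": {"spouse_A", "witness_B"},
    "r3": {"spouse_C"},
    "r4": {"spouse_C"},
}
sub_clusters = split_by_context(matched, context)
print(sub_clusters)
```

In this toy input, r1 and r2 share a spouse cluster, as do r3 and r4, so the name-based cluster splits into two sub-clusters. Raising `min_shared` demands more corroboration and yields smaller, stricter sub-clusters.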

    Cited By

    • (2022) Documenting the Creation, Manipulation and Evaluation of Links for Reuse and Reproducibility. Knowledge Engineering and Knowledge Management, 81-96. DOI: 10.1007/978-3-031-17105-5_6. Online publication date: 26-Sep-2022

        Published In

        K-CAP '19: Proceedings of the 10th International Conference on Knowledge Capture
        September 2019
        281 pages
        ISBN:9781450370080
        DOI:10.1145/3360901
        • General Chairs: Mayank Kejriwal, Pedro Szekely
        • Program Chair: Raphaël Troncy
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. data integration
        2. entity disambiguation
        3. entity reconciliation
        4. entity resolution
        5. linked data

        Qualifiers

        • Short-paper

        Funding Sources

        • NWO Large 2015-2016
        • RISIS

        Conference

        K-CAP '19: Knowledge Capture Conference
        November 19 - 21, 2019
        Marina Del Rey, CA, USA

        Acceptance Rates

        Overall Acceptance Rate 55 of 198 submissions, 28%

        Article Metrics

        • Downloads (last 12 months): 4
        • Downloads (last 6 weeks): 0
        Reflects downloads up to 10 Aug 2024
