DOI: 10.1145/3360901.3364442
research-article

On the Impact of sameAs on Schema Matching

Published: 23 September 2019

Abstract

In a large and decentralised knowledge representation system such as the Web of Data, it is common for data sets to overlap. In the absence of a central naming authority, semantic heterogeneity is inevitable as such overlapping contents are described using different schemas. To overcome this problem, a number of solutions have automated the integration of these data sets by matching their schemas. In this work, we focus on a specific category of these solutions that relies on the concepts' extension for matching the schemas (i.e., instance-based methods). Rather than introducing a new approach for the task of schema matching, this work studies the impact of exploiting the semantics of owl:sameAs in such instance-based methods. For this empirical analysis, we investigate more than 900K concepts extracted from the Web, and make use of over 35B implicit identity assertions to study their impact. The experiments show that despite the growing doubts over their quality, exploiting owl:sameAs assertions extracted from the Web can improve instance-based schema matching techniques.
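
To make the idea of instance-based matching concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes two hypothetical concepts whose extensions are small sets of made-up IRIs, uses Jaccard overlap as one possible similarity measure, and applies a union-find closure over a handful of invented owl:sameAs pairs. All names (`identity_closure`, `canonicalise`, `jaccard`, and the `a:`/`b:` IRIs) are illustrative assumptions.

```python
def identity_closure(same_as_pairs):
    """Map each IRI to one canonical representative of its identity set.

    owl:sameAs is reflexive, symmetric and transitive, so the pairs are
    merged into equivalence classes with a simple union-find structure.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Deterministic choice: the lexicographically smaller IRI is the root.
            parent[max(ra, rb)] = min(ra, rb)
    return {x: find(x) for x in list(parent)}


def canonicalise(extension, canon):
    """Replace every instance by its canonical representative, if it has one."""
    return {canon.get(instance, instance) for instance in extension}


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical extensions of two concepts from two different data sets.
ext_city = {"a:Paris", "a:Berlin", "a:Rome"}
ext_town = {"b:Paris", "b:Berlin", "b:Madrid"}

# Hypothetical owl:sameAs assertions linking instances across the data sets.
same_as = [("a:Paris", "b:Paris"), ("a:Berlin", "b:Berlin")]

canon = identity_closure(same_as)
plain = jaccard(ext_city, ext_town)
with_same_as = jaccard(canonicalise(ext_city, canon), canonicalise(ext_town, canon))

print(plain, with_same_as)
```

In this toy setting the two extensions share no IRI until the identity links are taken into account, which is the kind of effect the study measures at Web scale. An erroneous owl:sameAs link would inflate the overlap in the same way, which is why the quality of the assertions matters.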



Published In

K-CAP '19: Proceedings of the 10th International Conference on Knowledge Capture
September 2019
281 pages
ISBN:9781450370080
DOI:10.1145/3360901
  • General Chairs: Mayank Kejriwal, Pedro Szekely
  • Program Chair: Raphaël Troncy

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. identity
  2. linked open data
  3. schema matching

Qualifiers

  • Research-article

Conference

K-CAP '19
K-CAP '19: Knowledge Capture Conference
November 19 - 21, 2019
Marina Del Rey, CA, USA

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%

