DOI: 10.1145/1353343.1353358
research-article

Schema mapping verification: the spicy way

Published: 25 March 2008

    Abstract

    Schema mapping algorithms rely on value correspondences (i.e., correspondences among semantically related attributes) to produce complex transformations among data sources. These correspondences are either manually specified or suggested by separate modules called schema matchers. The quality of mappings produced by a mapping generation tool strongly depends on the quality of the input correspondences. In this paper, we introduce the Spicy system, a novel approach to the problem of verifying the quality of mappings. Spicy is based on a three-layer architecture, in which a schema matching module is used to provide input to a mapping generation module. Then, a third module, the mapping verification module, is used to check candidate mappings and choose the ones that represent better transformations of the source into the target. At the core of the system stands a new technique for comparing the structure and actual content of trees, called structural analysis. Experimental results show that, by carefully designing the comparison algorithm, it is possible to achieve both good scalability and high precision in mapping selection.
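
    The verification step described in the abstract (apply each candidate mapping to the source, then keep the candidates whose output looks most like the target) can be illustrated with a small sketch. This is not the structural analysis technique of the paper, whose details are not given on this page. It is a toy stand-in that compares two crude per-attribute content features (mean value length and ratio of distinct values). All names here (`column_features`, `similarity`, `rank_mappings`, the sample rows and the two candidate mappings) are hypothetical and chosen only for illustration.

```python
from statistics import mean


def column_features(rows, attr):
    """Crude content features for one attribute (an assumption, not Spicy's):
    mean string length of the values and ratio of distinct values."""
    values = [str(r[attr]) for r in rows if r.get(attr) is not None]
    if not values:
        return (0.0, 0.0)
    return (mean(len(v) for v in values), len(set(values)) / len(values))


def similarity(produced, reference, attrs):
    """Average per-attribute similarity in [0, 1] between two instances."""
    scores = []
    for a in attrs:
        p = column_features(produced, a)
        r = column_features(reference, a)
        # One minus the normalized absolute difference, for each feature.
        per_feature = [1 - abs(x - y) / max(x, y, 1e-9) for x, y in zip(p, r)]
        scores.append(mean(per_feature))
    return mean(scores)


def rank_mappings(candidates, source_rows, target_sample, attrs):
    """Run every candidate mapping on the source and sort the candidates
    by how closely their output resembles the target sample."""
    scored = []
    for name, mapping in candidates.items():
        produced = [mapping(row) for row in source_rows]
        scored.append((similarity(produced, target_sample, attrs), name))
    return sorted(scored, reverse=True)


# Hypothetical source instance and a sample of the existing target instance.
source_rows = [
    {"fullname": "Ada Lovelace", "town": "London"},
    {"fullname": "Alan Turing", "town": "Wilmslow"},
]
target_sample = [
    {"name": "Grace Hopper", "city": "New York"},
    {"name": "Edsger Dijkstra", "city": "Rotterdam"},
]

# Two candidate mappings, as a mapping generator might produce from
# alternative sets of value correspondences.
candidates = {
    "good": lambda r: {"name": r["fullname"], "city": r["town"]},
    "swapped": lambda r: {"name": r["town"], "city": r["fullname"]},
}

ranking = rank_mappings(candidates, source_rows, target_sample, ["name", "city"])
for score, name in ranking:
    print(f"{name}: {score:.3f}")
```

    In this toy setting the candidate built from the correct correspondences ranks first, because the values it produces resemble the target's values more closely than those of the swapped candidate. A real verifier would need far richer features and would have to compare nested structures rather than flat rows.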

    Published In

    EDBT '08: Proceedings of the 11th international conference on Extending database technology: Advances in database technology
    March 2008
    762 pages
    ISBN: 9781595939265
    DOI: 10.1145/1353343

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Conference

    EDBT '08

    Acceptance Rates

    Overall Acceptance Rate 7 of 10 submissions, 70%

    Cited By

    • (2024) Reasoning on property graphs with graph generating dependencies. Information Sciences: an International Journal, 672:C. DOI: 10.1016/j.ins.2024.120675. Online publication date: 1-Jun-2024
    • (2020) Knowledge translation. Proceedings of the VLDB Endowment, 13:12 (2018-2032). DOI: 10.14778/3407790.3407806. Online publication date: 14-Sep-2020
    • (2019) Dynamap. Proceedings of the 31st International Conference on Scientific and Statistical Database Management (37-48). DOI: 10.1145/3335783.3335785. Online publication date: 23-Jul-2019
    • (2017) UFeed. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (187-196). DOI: 10.1145/3132847.3132887. Online publication date: 6-Nov-2017
    • (2016) Efficient Feedback Collection for Pay-as-you-go Source Selection. Proceedings of the 28th International Conference on Scientific and Statistical Database Management (1-12). DOI: 10.1145/2949689.2949690. Online publication date: 18-Jul-2016
    • (2016) Pay-as-you-go Data Integration. Proceedings of the 42nd International Conference on SOFSEM 2016: Theory and Practice of Computer Science - Volume 9587 (81-92). DOI: 10.1007/978-3-662-49192-8_7. Online publication date: 23-Jan-2016
    • (2015) Enabling community-driven information integration through clustering. Distributed and Parallel Databases, 33:1 (33-67). DOI: 10.1007/s10619-014-7160-z. Online publication date: 1-Mar-2015
    • (2014) A Methodology and Architecture Embedding Quality Assessment in Data Integration. Journal of Data and Information Quality, 4:4 (1-40). DOI: 10.1145/2567663. Online publication date: 1-May-2014
    • (2014) Improving Clustering-Based Schema Matching Using Latent Semantic Indexing. Transactions on Large-Scale Data- and Knowledge-Centered Systems XV (102-123). DOI: 10.1007/978-3-662-45761-0_4. Online publication date: 12-Dec-2014
    • (2014) Query Reformulation in PDMS Based on Social Relevance. Transactions on Large-Scale Data- and Knowledge-Centered Systems XIII (59-90). DOI: 10.1007/978-3-642-54426-2_3. Online publication date: 5-Mar-2014