research-article

TRAMP: understanding the behavior of schema mappings through provenance

Published: 01 September 2010

Abstract

Though partially automated, developing schema mappings remains a complex and potentially error-prone task. In this paper, we present TRAMP (TRAnsformation Mapping Provenance), an extensive suite of tools supporting the debugging and tracing of schema mappings and transformation queries. TRAMP combines and extends data provenance with two novel notions, transformation provenance and mapping provenance, to explain the relationship between transformed data and the transformations and mappings that produced it. In addition, we provide query support for transformations, data, and all forms of provenance. We formally define transformation and mapping provenance, present an efficient implementation of both forms of provenance, and evaluate the resulting system through extensive experiments.
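
To make the idea of attributing transformed data to the mappings that produced it more concrete, the following is a minimal sketch in Python. It is not TRAMP's implementation or its formal definitions: the relation `employees`, the mappings `m1` and `m2`, and the helpers `run` and `explain` are all hypothetical names invented for illustration. Each target tuple is simply annotated with the source tuples it came from (a rough stand-in for data provenance) and the name of the mapping that produced it (a rough stand-in for mapping provenance). Transformation provenance is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotated:
    """A target tuple together with the provenance recorded for it."""
    row: tuple
    sources: frozenset  # source tuples that contributed to this row
    mapping: str        # name of the (hypothetical) mapping that produced it


# Hypothetical source relation with schema (name, city).
employees = [("ann", "Zurich"), ("bob", "Toronto")]


def m1(rows):
    """Hypothetical mapping m1: copy each employee name into the target."""
    for r in rows:
        yield Annotated(row=(r[0],), sources=frozenset({r}), mapping="m1")


def m2(rows):
    """Hypothetical mapping m2: copy each employee city into the target."""
    for r in rows:
        yield Annotated(row=(r[1],), sources=frozenset({r}), mapping="m2")


def run(rows):
    """Apply both mappings and collect the annotated target tuples."""
    return list(m1(rows)) + list(m2(rows))


def explain(target, value):
    """Return (mapping, source tuples) pairs for target rows equal to (value,)."""
    return [(t.mapping, sorted(t.sources)) for t in target if t.row == (value,)]


target = run(employees)
explanation = explain(target, "Toronto")
```

Under these assumptions, asking why the value `"Toronto"` appears in the target points back to mapping `m2` and the source tuple `("bob", "Toronto")`, which is the kind of question a mapping debugger would need to answer.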

Published In

Proceedings of the VLDB Endowment, Volume 3, Issue 1-2
September 2010
1658 pages

Publisher

VLDB Endowment
