DOI: 10.1145/1858158.1858163
research-article

Facilitating fine grained data provenance using temporal data model

Published: 13 September 2010

Abstract

E-science applications use fine grained data provenance to keep scientific results reproducible: for each processed data tuple, both the source data used to produce it and the processing approach applied are documented. Because most e-science applications process sensor data online using overlapping time windows, the overhead of maintaining fine grained data provenance is huge, especially in longer data processing chains, since each data item is used by many time windows. In this paper, we propose an approach that reduces the storage cost of fine grained data provenance by maintaining provenance at the relation level instead of the tuple level and by making the content of the underlying database reproducible. The approach has been implemented as a prototype for streaming and manually sampled data.
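
The storage argument in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Reading` type, the averaging operator, the window parameters, and all function names are assumptions chosen for illustration. The sketch compares storing explicit input references per output tuple with storing only a time interval per output and recovering the inputs from a timestamped, append-only relation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    ts: int        # hypothetical timestamp at which the tuple entered the relation
    value: float   # hypothetical sensor value


def sliding_windows(first_ts, last_ts, size, slide):
    """Yield half-open windows [start, start + size) that fit in [first_ts, last_ts]."""
    start = first_ts
    while start + size - 1 <= last_ts:
        yield start, start + size
        start += slide


def tuples_in(readings, start, end):
    """Select the readings whose timestamp falls inside [start, end)."""
    return [r for r in readings if start <= r.ts < end]


def window_average(inputs):
    return sum(r.value for r in inputs) / len(inputs)


def tuple_level(readings, size, slide):
    """Tuple-level provenance: each output keeps the ids of all its inputs."""
    out = []
    for start, end in sliding_windows(readings[0].ts, readings[-1].ts, size, slide):
        inputs = tuples_in(readings, start, end)
        out.append((window_average(inputs), [r.ts for r in inputs]))
    return out


def relation_level(readings, size, slide):
    """Relation-level provenance (assumed form): each output keeps only its interval."""
    out = []
    for start, end in sliding_windows(readings[0].ts, readings[-1].ts, size, slide):
        inputs = tuples_in(readings, start, end)
        out.append((window_average(inputs), (start, end)))
    return out


def reconstruct(readings, interval):
    """Recover the input ids of an output by re-querying the timestamped relation."""
    start, end = interval
    return [r.ts for r in tuples_in(readings, start, end)]


# Ten readings, windows of size 4 sliding by 1, so windows overlap heavily.
readings = [Reading(ts=t, value=float(t)) for t in range(10)]
per_tuple = tuple_level(readings, size=4, slide=1)
per_relation = relation_level(readings, size=4, slide=1)

stored_refs = sum(len(ids) for _, ids in per_tuple)   # explicit input references
stored_intervals = len(per_relation)                  # one interval per output
```

With these assumed parameters the tuple-level variant stores 28 input references for 7 outputs, while the interval variant stores 7 intervals. Reconstruction only works if the input relation can be reproduced as it was at processing time, which this sketch takes for granted by never updating or deleting readings.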

Published In

DMSN '10: Proceedings of the Seventh International Workshop on Data Management for Sensor Networks
September 2010
45 pages
ISBN: 9781450304160
DOI: 10.1145/1858158
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • CONET

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. e-science applications
  2. fine grained data provenance
  3. sensor data
  4. temporal data model

Qualifiers

  • Research-article

Conference

DMSN '10

Acceptance Rates

Overall Acceptance Rate 6 of 16 submissions, 38%

Cited By

  • (2016) Provenance in Web Feed Mash-Up Systems. International Journal of Information Technology and Web Engineering 11:4 (43-62). DOI: 10.4018/IJITWE.2016100103. Online publication date: 1-Oct-2016
  • (2013) Ariadne. Proceedings of the 7th ACM international conference on Distributed event-based systems (39-50). DOI: 10.1145/2488222.2488256. Online publication date: 29-Jun-2013
  • (2013) An Inference-Based Framework to Manage Data Provenance in Geoscience Applications. IEEE Transactions on Geoscience and Remote Sensing 51:11 (5113-5130). DOI: 10.1109/TGRS.2013.2247769. Online publication date: Nov-2013
  • (2013) An on-the-fly provenance tracking mechanism for stream processing systems. 2013 IEEE/ACIS 12th International Conference on Computer and Information Science (ICIS) (475-481). DOI: 10.1109/ICIS.2013.6607885. Online publication date: Jun-2013
  • (2013) Towards Automatic Capturing of Semi-structured Process Provenance. Data-Driven Process Discovery and Analysis (84-99). DOI: 10.1007/978-3-642-40919-6_5. Online publication date: 2013
  • (2012) From scripts towards provenance inference. Proceedings of the 2012 IEEE 8th International Conference on E-Science (e-Science) (1-8). DOI: 10.1109/eScience.2012.6404467. Online publication date: 8-Oct-2012
  • (2012) Towards integrating workflow and database provenance. Proceedings of the 4th international conference on Provenance and Annotation of Data and Processes (11-23). DOI: 10.1007/978-3-642-34222-6_2. Online publication date: 19-Jun-2012
  • (2012) Probabilistic Inference of Fine-Grained Data Provenance. Database and Expert Systems Applications (296-310). DOI: 10.1007/978-3-642-32600-4_22. Online publication date: 2012
  • (2012) Fine-grained provenance inference for a large processing chain with non-materialized intermediate views. Proceedings of the 24th international conference on Scientific and Statistical Database Management (397-405). DOI: 10.1007/978-3-642-31235-9_26. Online publication date: 25-Jun-2012
  • (2011) Inferring fine-grained data provenance in stream data processing. Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part II (118-127). DOI: 10.5555/2033546.2033560. Online publication date: 29-Aug-2011
