Article (DOI: 10.5555/645482.653450)

Supporting Fine-grained Data Lineage in a Database Visualization Environment

Published: 07 April 1997

Abstract

    The lineage of a datum records its processing history. Because such information can be used to trace the source of anomalies and errors in processed data sets, it is valuable to users for a variety of applications, including the investigation of anomalies and debugging. Traditional data lineage approaches rely on metadata. However, metadata does not scale well to fine-grained lineage, especially in large data sets. For example, it is not feasible to store all of the information that is necessary to trace from a specific floating-point value in a processed data set to a particular satellite image pixel in a source data set. In this paper, we propose a novel method to support fine-grained data lineage. Rather than relying on metadata, our approach lazily computes the lineage using a limited amount of information about the processing operators and the base data. We introduce the notions of weak inversion and verification. While our system does not perfectly invert the data, it uses weak inversion and verification to provide a number of guarantees about the lineage it generates. We propose a design for the implementation of weak inversion and verification in an object-relational database management system.
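The abstract names weak inversion and verification but does not define them, so the following is only a minimal illustrative sketch of the general idea, not the paper's method. It assumes a toy operator (`row_max`) and hypothetical helper names (`weak_inverse`, `verify`, `lineage`): the weak inverse uses only knowledge of the operator to return an over-approximate set of candidate source cells, and verification consults the base data to narrow that set, so lineage is computed lazily rather than stored as metadata.

```python
from typing import List, Tuple

Image = List[List[float]]
Cell = Tuple[int, int]  # (row, column) in the base data


def row_max(image: Image) -> List[float]:
    """Toy forward operator: reduce each row to its maximum value."""
    return [max(row) for row in image]


def weak_inverse(image: Image, out_index: int) -> List[Cell]:
    """Hypothetical weak inverse for row_max.

    Knows only that output i is derived from row i, so it returns
    every cell in that row: a superset of the true lineage.
    """
    return [(out_index, col) for col in range(len(image[out_index]))]


def verify(image: Image, out_value: float, candidates: List[Cell]) -> List[Cell]:
    """Hypothetical verification step.

    Checks each candidate against the base data and keeps only the
    cells whose value actually equals the processed value.
    """
    return [(r, c) for (r, c) in candidates if image[r][c] == out_value]


def lineage(image: Image, processed: List[float], out_index: int) -> List[Cell]:
    """Lazily compute lineage on demand; nothing is stored per datum."""
    candidates = weak_inverse(image, out_index)
    return verify(image, processed[out_index], candidates)


image = [[0.1, 0.7, 0.3],
         [0.9, 0.2, 0.9]]
processed = row_max(image)

print(lineage(image, processed, 0))  # cells behind the first output value
print(lineage(image, processed, 1))  # a tie yields more than one source cell
```

In this sketch the only stored information is the operator's definition and the base data; the per-value lineage is recomputed whenever a user asks for it.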

    Published In

    ICDE '97: Proceedings of the Thirteenth International Conference on Data Engineering
    April 1997
    542 pages
    ISBN:0818678070

    Publisher

    IEEE Computer Society

    United States

    Author Tags

    1. anomalies
    2. base data
    3. data processing history
    4. data visualisation
    5. database visualization environment
    6. debugging
    7. error sources
    8. fine-grained data lineage
    9. large data sets
    10. lazy algorithm
    11. limited information
    12. lineage guarantees
    13. metadata
    14. object-relational database management system
    15. processed data sets
    16. processing operators
    17. tracing
    18. verification
    19. weak inversion

    Cited By

    • (2020) Inspector gadget. Proceedings of the VLDB Endowment 4:12 (1237-1248). DOI: 10.14778/3402755.3402758. Online publication date: 3-Jun-2020
    • (2019) Data Provenance. ACM SIGMOD Record 47:3 (5-16). DOI: 10.1145/3316416.3316418. Online publication date: 27-Feb-2019
    • (2018) Debugging Distributed Systems with Why-Across-Time Provenance. Proceedings of the ACM Symposium on Cloud Computing (333-346). DOI: 10.1145/3267809.3267839. Online publication date: 11-Oct-2018
    • (2018) Provenance for Interactive Visualizations. Proceedings of the Workshop on Human-In-the-Loop Data Analytics (1-8). DOI: 10.1145/3209900.3209904. Online publication date: 10-Jun-2018
    • (2018) Provenance and Probabilities in Relational Databases. ACM SIGMOD Record 46:4 (5-15). DOI: 10.1145/3186549.3186551. Online publication date: 22-Feb-2018
    • (2018) A systematic review of provenance systems. Knowledge and Information Systems 57:3 (495-543). DOI: 10.1007/s10115-018-1164-3. Online publication date: 1-Dec-2018
    • (2017) Diagnosing Machine Learning Pipelines with Fine-grained Lineage. Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing (143-153). DOI: 10.1145/3078597.3078603. Online publication date: 26-Jun-2017
    • (2017) Distributed Provenance Compression. Proceedings of the 2017 ACM International Conference on Management of Data (203-218). DOI: 10.1145/3035918.3035926. Online publication date: 9-May-2017
    • (2017) Model provenance tracking and inference for integrated environmental modelling. Environmental Modelling & Software 96:C (95-105). DOI: 10.1016/j.envsoft.2017.06.051. Online publication date: 1-Oct-2017
    • (2017) A survey on provenance. The VLDB Journal — The International Journal on Very Large Data Bases 26:6 (881-906). DOI: 10.1007/s00778-017-0486-1. Online publication date: 1-Dec-2017
