DOI: 10.5555/1783534.1783549
Article

Provenance as dependency analysis

Published: 23 September 2007

Abstract

Provenance is information recording the source, derivation, or history of some information. Provenance tracking has been studied in a variety of settings; however, although many design points have been explored, the mathematical or semantic foundations of data provenance have received comparatively little attention. In this paper, we argue that dependency analysis techniques familiar from program analysis and program slicing provide a formal foundation for forms of provenance that are intended to show how (part of) the output of a query depends on (parts of) its input. We introduce a semantic characterization of such dependency provenance, show that this form of provenance is not computable, and provide dynamic and static approximation techniques.
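
As an informal illustration of the dynamic approach mentioned in the abstract, the following Python sketch attaches a set of input labels to every value and propagates those sets through a small filter-and-sum query. It is not the paper's formal system: the names `Tracked`, `label_rows`, and `query`, and the label scheme `r<row>.<field>`, are hypothetical and chosen only for this example. The sketch is deliberately approximate, since it records dependencies per output element and does not record that the shape of the whole result also depends on rows the filter rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tracked:
    """A value paired with the set of input labels it may depend on."""
    value: int
    deps: frozenset = frozenset()


def label_rows(raw_rows):
    """Wrap every field of every row with a unique label such as 'r0.a'."""
    return [
        {name: Tracked(v, frozenset({f"r{i}.{name}"})) for name, v in row.items()}
        for i, row in enumerate(raw_rows)
    ]


def add(x, y):
    """Sum two tracked values; the result depends on both operands."""
    return Tracked(x.value + y.value, x.deps | y.deps)


def query(rows, threshold):
    """For each row with a > threshold, output b + c with its dependencies."""
    out = []
    for row in rows:
        if row["a"].value > threshold:
            total = add(row["b"], row["c"])
            # The output also depends on the field tested by the filter,
            # because changing it could remove this element from the result.
            out.append(Tracked(total.value, total.deps | row["a"].deps))
    return out


rows = label_rows([
    {"a": 5, "b": 1, "c": 2},
    {"a": 0, "b": 10, "c": 20},
])
result = query(rows, threshold=3)
for item in result:
    print(item.value, sorted(item.deps))
```

Only the first row passes the filter, so the single output element carries the labels of that row's `a`, `b`, and `c` fields. Changing any input outside that label set leaves this particular element unchanged, which is the kind of guarantee dependency provenance is meant to express.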




Published In

DBPL'07: Proceedings of the 11th international conference on Database programming languages
September 2007
261 pages
ISBN: 3540759867
Editors: Marcelo Arenas, Michael I. Schwartzbach

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

  • Article

Cited By

  • (2018) Dac-Man. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp. 1-13. DOI: 10.5555/3291656.3291753. Online publication date: 11-Nov-2018
  • (2018) Dac-Man. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp. 1-13. DOI: 10.1109/SC.2018.00075. Online publication date: 11-Nov-2018
  • (2016) Analysis of Memory Constrained Live Provenance. Proceedings of the 6th International Workshop on Provenance and Annotation of Data and Processes - Volume 9672, pp. 42-54. DOI: 10.5555/3090188.3090193. Online publication date: 7-Jun-2016
  • (2014) LabelFlow. Revised Selected Papers of the 5th International Provenance and Annotation Workshop on Provenance and Annotation of Data and Processes - Volume 8628, pp. 84-96. DOI: 10.1007/978-3-319-16462-5_7. Online publication date: 9-Jun-2014
  • (2014) Regenerating and Quantifying Quality of Benchmarking Data Using Static and Dynamic Provenance. Revised Selected Papers of the 5th International Provenance and Annotation Workshop on Provenance and Annotation of Data and Processes - Volume 8628, pp. 56-67. DOI: 10.1007/978-3-319-16462-5_5. Online publication date: 9-Jun-2014
  • (2013) Static compiler analysis for workflow provenance. Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science, pp. 17-27. DOI: 10.1145/2534248.2534250. Online publication date: 17-Nov-2013
  • (2013) Provenance from log files. Proceedings of the Joint EDBT/ICDT 2013 Workshops, pp. 290-297. DOI: 10.1145/2457317.2457366. Online publication date: 18-Mar-2013
  • (2012) Datalog as a lingua franca for provenance querying and reasoning. Proceedings of the 4th USENIX conference on Theory and Practice of Provenance, p. 13. DOI: 10.5555/2342875.2342888. Online publication date: 14-Jun-2012
  • (2011) Semantic invalidation of annotations due to ontology evolution. Proceedings of the 2011th Confederated international conference on On the move to meaningful internet systems - Volume Part II, pp. 763-780. DOI: 10.5555/2075764.2075799. Online publication date: 17-Oct-2011
  • (2011) PrIMe. ACM Transactions on Software Engineering and Methodology 20:3, pp. 1-42. DOI: 10.1145/2000791.2000792. Online publication date: 26-Aug-2011
