DOI: 10.1145/2939502.2939508
research-article

Clustering provenance: facilitating provenance exploration through data abstraction

Published: 26 June 2016

Abstract

As digital objects become increasingly important in people's lives, people may need to understand the provenance, or lineage and history, of an important digital object in order to understand how it was produced. This is particularly important for objects created from large, multi-source collections of personal data. Because provenance data, the metadata describing provenance, is commonly represented as a labelled directed acyclic graph, the challenge is to create effective interfaces onto such graphs so that people can understand the provenance of key digital objects. This unsolved problem is especially challenging for novice and intermittent users and for complex provenance graphs. We tackle it with an interface based on a clustering approach, designed to let users view provenance graphs and simplify complex graphs by combining several nodes. Our core contribution is the design of a prototype interface that supports clustering, together with an analytic evaluation of that interface in terms of desirable properties of visualisation interfaces.
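
The idea of simplifying a labelled directed acyclic graph by combining several nodes can be illustrated with a small sketch. This is not the paper's prototype: the function name `collapse_cluster`, the edge representation, and the example file names are hypothetical, and the edge labels `wasGeneratedBy` and `used` are borrowed from W3C PROV purely for illustration.

```python
from typing import Set, Tuple

# A labelled edge: (source node, label, target node). Hypothetical representation.
Edge = Tuple[str, str, str]


def collapse_cluster(
    nodes: Set[str], edges: Set[Edge], members: Set[str], cluster_id: str
) -> Tuple[Set[str], Set[Edge]]:
    """Return a new graph in which all `members` are replaced by `cluster_id`."""
    if not members <= nodes:
        raise ValueError("every cluster member must be an existing node")
    if cluster_id in nodes - members:
        raise ValueError("cluster_id clashes with a node outside the cluster")

    new_nodes = (nodes - members) | {cluster_id}
    new_edges: Set[Edge] = set()
    for src, label, dst in edges:
        s = cluster_id if src in members else src
        d = cluster_id if dst in members else dst
        if s == d == cluster_id:
            continue  # edge lies entirely inside the cluster, so it is hidden
        new_edges.add((s, label, d))  # edges crossing the boundary are rerouted
    return new_nodes, new_edges


# Illustrative provenance graph (assumed data): a report derived from two files.
nodes = {"report.pdf", "plot", "merged.csv", "merge", "steps.csv", "sleep.csv"}
edges = {
    ("report.pdf", "wasGeneratedBy", "plot"),
    ("plot", "used", "merged.csv"),
    ("merged.csv", "wasGeneratedBy", "merge"),
    ("merge", "used", "steps.csv"),
    ("merge", "used", "sleep.csv"),
}

# Combine the intermediate processing steps into a single "pipeline" node.
small_nodes, small_edges = collapse_cluster(
    nodes, edges, members={"plot", "merged.csv", "merge"}, cluster_id="pipeline"
)
```

In this sketch the six-node graph shrinks to four nodes, with the report linked to the two source files through one cluster node. Collapsing an arbitrary set of nodes can introduce a cycle (for example, merging the two ends of a chain while leaving the middle node outside the cluster), so a real interface would likely need to check or restrict which nodes may be combined.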




Published In

HILDA '16: Proceedings of the Workshop on Human-In-the-Loop Data Analytics
June 2016
93 pages
ISBN: 9781450342070
DOI: 10.1145/2939502
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

  • Paxata
  • Tableau Software
  • Trifacta
  • IBM

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. large-scale graphs
  2. provenance
  3. visualisation

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'16: International Conference on Management of Data
June 26 - July 1, 2016
San Francisco, California

Acceptance Rates

HILDA '16 Paper Acceptance Rate 16 of 32 submissions, 50%;
Overall Acceptance Rate 28 of 56 submissions, 50%


Cited By

  • (2024) Prov-Dominoes: An approach for knowledge discovery from provenance data. Expert Systems with Applications, 245 (123030). DOI: 10.1016/j.eswa.2023.123030. Online publication date: Jul-2024
  • (2023) A Design Space for Surfacing Content Recommendations in Visual Analytic Platforms. IEEE Transactions on Visualization and Computer Graphics, 29:1 (84-94). DOI: 10.1109/TVCG.2022.3209445. Online publication date: Jan-2023
  • (2022) Visionary: a framework for analysis and visualization of provenance data. Knowledge and Information Systems, 64:2 (381-413). DOI: 10.1007/s10115-021-01645-6. Online publication date: 4-Jan-2022
  • (2020) On Efficiently Processing Business Lineage Queries. 2020 IEEE International Conference on Big Data (Big Data) (513-522). DOI: 10.1109/BigData50022.2020.9377973. Online publication date: 10-Dec-2020
  • (2019) Implementations of fine-grained automated data provenance to support transparent environmental modelling. Environmental Modelling & Software. DOI: 10.1016/j.envsoft.2019.04.009. Online publication date: Apr-2019
  • (2018) Provenance Analytics for Workflow-Based Computational Experiments. ACM Computing Surveys, 51:3 (1-25). DOI: 10.1145/3184900. Online publication date: 23-May-2018
  • (2018) KEYSTONE WG1: Activities and Results Overview on Representation of Structured Data Sources. Semantic Keyword-Based Search on Structured Data Sources (196-214). DOI: 10.1007/978-3-319-74497-1_20. Online publication date: 8-Feb-2018
  • (2017) A survey on provenance. The VLDB Journal, 26:6 (881-906). DOI: 10.1007/s00778-017-0486-1. Online publication date: 1-Dec-2017
