Charting EDA: Characterizing Interactive Visualization Use in Computational Notebooks with a Mixed-Methods Formalism

Published: 01 January 2025

Abstract

Interactive visualizations are powerful tools for Exploratory Data Analysis (EDA), but how do they affect the observations analysts make about their data? We conducted a qualitative experiment with 13 professional data scientists analyzing two datasets with Jupyter notebooks, collecting a rich dataset of interaction traces and think-aloud utterances. By qualitatively coding participant utterances, we introduce a formalism that describes EDA as a sequence of analysis states, where each state consists of either a representation an analyst constructs (e.g., the output of a data frame, an interactive visualization, etc.) or an observation the analyst makes (e.g., about missing data, the relationship between variables, etc.). By applying our formalism to our dataset, we identify that interactive visualizations, on average, lead to earlier and more complex insights about relationships between dataset attributes compared to static visualizations. Moreover, by calculating metrics such as revisit count and representational diversity, we uncover that some representations serve more as “planning aids” during EDA than as tools strictly for hypothesis-answering. We show how these measures help identify other patterns of analysis behavior, such as the “80-20 rule”, where a small subset of representations drove the majority of observations. Based on these findings, we offer design guidelines for interactive exploratory analysis tooling and reflect on future directions for studying the role that visualizations play in EDA.
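
To make the idea of a sequence of analysis states concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define its metrics, so everything here is an assumption made for illustration: the `State` class, the rule that a revisit is any reappearance of a representation after its first use, the rule that an observation is attributed to the most recently constructed representation, and the toy session labels.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class State:
    """One analysis state (hypothetical encoding of the formalism)."""
    kind: Literal["representation", "observation"]
    label: str  # illustrative identifier, not taken from the paper


def revisit_counts(states):
    """Assumed definition: times a representation reappears after its first use."""
    seen = Counter(s.label for s in states if s.kind == "representation")
    return {label: n - 1 for label, n in seen.items()}


def observations_per_representation(states):
    """Assumed rule: credit each observation to the most recent representation."""
    counts = Counter()
    current = None
    for s in states:
        if s.kind == "representation":
            current = s.label
        elif current is not None:
            counts[current] += 1
    return counts


def top_share(counts, fraction=0.2):
    """Share of observations credited to the top `fraction` of representations."""
    if not counts:
        return 0.0
    k = max(1, round(len(counts) * fraction))
    top = sum(n for _, n in counts.most_common(k))
    return top / sum(counts.values())


# Toy session, invented purely for illustration.
session = [
    State("representation", "df.head()"),
    State("observation", "price has missing values"),
    State("representation", "scatter(price, area)"),
    State("observation", "price rises with area"),
    State("observation", "a few extreme outliers"),
    State("representation", "df.head()"),
    State("representation", "hist(price)"),
    State("observation", "price is right-skewed"),
]

print(revisit_counts(session))
print(top_share(observations_per_representation(session)))
```

In this toy session the data frame preview is revisited once, and the single most productive representation accounts for half of the observations. A skew of that kind is what an "80-20" style measure is meant to surface, though the paper's own metric definitions may differ from the ones assumed here.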

Published In

IEEE Transactions on Visualization and Computer Graphics, Volume 31, Issue 1
Jan. 2025
1353 pages

Publisher

IEEE Educational Activities Department

United States

Qualifiers

  • Research-article
