ChatGPT in Data Visualization Education: A Student Perspective

Authors: Nam Wook Kim, Hyung-Kwon Ko, Grace Myers, Benjamin Bach

Abstract: Unlike traditional educational chatbots that rely on pre-programmed responses, large-language model-driven chatbots, such as ChatGPT, demonstrate remarkable versatility and have the potential to serve as a dynamic resource for addressing student needs from understanding advanced concepts to solving complex problems. This work explores the impact of such technology on student learning in an interdisciplinary, project-oriented data visualization course. Throughout the semester, students engaged with ChatGPT across four distinct projects, including data visualizations and implementing them using a variety of tools including Tableau, D3, and Vega-lite. We collected conversation logs and reflection surveys from the students after each assignment. In addition, we conducted interviews with selected students to gain deeper insights into their overall experiences with ChatGPT. Our analysis examined the advantages and barriers of using ChatGPT, students' querying behavior, the types of assistance sought, and its impact on assignment outcomes and engagement. Based on the findings, we discuss design considerations for an educational solution that goes beyond the basic interface of ChatGPT, specifically tailored for data visualization education.

Submitted 30 April, 2024; originally announced May 2024.

Comments: 12 pages; 3 figures

arXiv:2310.11521

DataGarden: Exploring our Community in a VR Data Visualization

Authors: Joy Kondo, Justin Park, Josiah Kondo, Nam Wook Kim

Abstract: As our society is becoming increasingly data-dependent, more and more people rely on charts and graphs to understand and communicate complex data. While such visualizations effectively reveal meaningful trends, they unavoidably aggregate data into points and bars that are overly simplified depictions of ourselves and our communities. We present DataGarden, a system that supports embodied interactions with humane data representations in an immersive VR environment. Through the system, we explore ways to rethink the traditional visualization approach and allow people to empathize more deeply with the people behind the data.

Submitted 17 October, 2023; originally announced October 2023.

Comments: 10 pages, 2 figures

arXiv:2310.09617

How Good is ChatGPT in Giving Advice on Your Visualization Design?

Authors: Nam Wook Kim, Grace Myers, Benjamin Bach

Abstract: Data visualization practitioners often lack formal training, resulting in a knowledge gap in visualization design best practices. Large-language models like ChatGPT, with their vast internet-scale training data, offer transformative potential in addressing this gap. To explore this potential, we adopted a mixed-method approach. Initially, we analyzed the VisGuide forum, a repository of data visualization questions, by comparing ChatGPT-generated responses to human replies. Subsequently, our user study delved into practitioners' reactions and attitudes toward ChatGPT as a visualization assistant. Participants, who brought their visualizations and questions, received feedback from both human experts and ChatGPT in a randomized order. They filled out experience surveys and shared deeper insights through post-interviews. The results highlight the unique advantages and disadvantages of ChatGPT, such as its ability to quickly provide a wide range of design options based on a broad knowledge base, while also revealing its limitations in terms of depth and critical thinking capabilities.

Submitted 30 April, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

Comments: 24 pages, 4 figures

arXiv:2310.09614

Bridging the Divide: Unraveling the Knowledge Gap in Data Visualization Research and Practice

Authors: Nam Wook Kim, Grace Myers, Jinhan Choi, Yoonsuh Cho, Changhoon Oh, Yea-Seul Kim

Abstract: Empirical research on perception and cognition has laid the foundation for visualization design, often yielding useful design guidelines for practitioners. However, it remains uncertain how well practitioners stay informed about such crucial visualization design knowledge. In this paper, we employed a mixed-method approach to explore the knowledge gap between visualization research and real-world design guidelines. We initially collected existing design guidelines from various sources and empirical studies from diverse publishing venues, analyzing their alignment and uncovering missing links and inconsistent knowledge. Subsequently, we conducted surveys and interviews with practitioners and researchers to gain further insights into their experiences and attitudes towards design guidelines and empirical studies, and their views on the knowledge gap between research and practice. Our findings highlight the similarities and differences in their perspectives and propose strategies to bridge the divide in visualization design knowledge.

Submitted 30 January, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

Comments: 15 pages, 5 figures

arXiv:2310.09611

VizAbility: Enhancing Chart Accessibility with LLM-based Conversational Interaction

Authors: Joshua Gorniak, Yoon Kim, Donglai Wei, Nam Wook Kim

Abstract: Traditional accessibility methods like alternative text and data tables typically underrepresent data visualization's full potential. Keyboard-based chart navigation has emerged as a potential solution, yet efficient data exploration remains challenging. We present VizAbility, a novel system that enriches chart content navigation with conversational interaction, enabling users to use natural language for querying visual data trends. VizAbility adapts to the user's navigation context for improved response accuracy and facilitates verbal command-based chart navigation. Furthermore, it can address queries for contextual information, designed to address the needs of visually impaired users. We designed a large language model (LLM)-based pipeline to address these user queries, leveraging chart data & encoding, user context, and external web knowledge. We conducted both qualitative and quantitative studies to evaluate VizAbility's multimodal approach. We discuss further opportunities based on the results, including improved benchmark testing, incorporation of vision models, and integration with visualization workflows.

Submitted 30 April, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

Comments: 19 pages, 8 figures

arXiv:2309.10245

doi 10.1145/3613904.3642943

Natural Language Dataset Generation Framework for Visualizations Powered by Large Language Models

Authors: Hyung-Kwon Ko, Hyeon Jeon, Gwanmo Park, Dae Hyun Kim, Nam Wook Kim, Juho Kim, Jinwook Seo

Abstract: We introduce VL2NL, a Large Language Model (LLM) framework that generates rich and diverse NL datasets using only Vega-Lite specifications as input, thereby streamlining the development of Natural Language Interfaces (NLIs) for data visualization. To synthesize relevant chart semantics accurately and enhance syntactic diversity in each NL dataset, we leverage 1) a guided discovery incorporated into prompting so that LLMs can steer themselves to create faithful NL datasets in a self-directed manner; 2) a score-based paraphrasing to augment NL syntax along with four language axes. We also present a new collection of 1,981 real-world Vega-Lite specifications that have increased diversity and complexity than existing chart collections. When tested on our chart collection, VL2NL extracted chart semantics and generated L1/L2 captions with 89.4% and 76.0% accuracy, respectively. It also demonstrated generating and paraphrasing utterances and questions with greater diversity compared to the benchmarks. Last, we discuss how our NL datasets and framework can be utilized in real-world scenarios. The codes and chart collection are available at https://github.com/hyungkwonko/chart-llm.

Submitted 21 January, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: 22 pages, 5 figures

Journal ref: In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), May 11-16, 2024, Honolulu, HI, USA

arXiv:2108.04203

Kori: Interactive Synthesis of Text and Charts in Data Documents

Authors: Shahid Latif, Zheng Zhou, Yoon Kim, Fabian Beck, Nam Wook Kim

Abstract: Charts go hand in hand with text to communicate complex data and are widely adopted in news articles, online blogs, and academic papers. They provide graphical summaries of the data, while text explains the message and context. However, synthesizing information across text and charts is difficult; it requires readers to frequently shift their attention. We investigated ways to support the tight coupling of text and charts in data documents. To understand their interplay, we analyzed the design space of chart-text references through news articles and scientific papers. Informed by the analysis, we developed a mixed-initiative interface enabling users to construct interactive references between text and charts. It leverages natural language processing to automatically suggest references as well as allows users to manually construct other references effortlessly. A user study complemented with algorithmic evaluation of the system suggests that the interface provides an effective way to compose interactive data documents.

Submitted 9 August, 2021; originally announced August 2021.

arXiv:2104.11386

Recording Reusable and Guided Analytics From Interaction Histories

Authors: Nam Wook Kim

Abstract: The use of visual analytics tools has gained popularity in various domains, helping users discover meaningful information from complex and large data sets. Users often face difficulty in disseminating the knowledge discovered without clear recall of their exploration paths and analysis processes. We introduce a visual analysis tool that allows analysts to record reusable and guided analytics from their interaction logs. To capture the analysis process, we use a decision tree whose node embeds visualizations and guide to define a visual analysis task. The tool enables analysts to formalize analysis strategies, build best practices, and guide novices through systematic workflows.

Submitted 22 April, 2021; originally announced April 2021.

Comments: 2 pages, 2 figures, conference

arXiv:2001.04461

TurkEyes: A Web-Based Toolbox for Crowdsourcing Attention Data

Authors: Anelise Newman, Barry McNamara, Camilo Fosco, Yun Bin Zhang, Pat Sukhum, Matthew Tancik, Nam Wook Kim, Zoya Bylinskii

Abstract: Eye movements provide insight into what parts of an image a viewer finds most salient, interesting, or relevant to the task at hand. Unfortunately, eye tracking data, a commonly-used proxy for attention, is cumbersome to collect. Here we explore an alternative: a comprehensive web-based toolbox for crowdsourcing visual attention. We draw from four main classes of attention-capturing methodologies in the literature. ZoomMaps is a novel "zoom-based" interface that captures viewing on a mobile phone. CodeCharts is a "self-reporting" methodology that records points of interest at precise viewing durations. ImportAnnots is an "annotation" tool for selecting important image regions, and "cursor-based" BubbleView lets viewers click to deblur a small area. We compare these methodologies using a common analysis framework in order to develop appropriate use cases for each interface. This toolbox and our analyses provide a blueprint for how to gather attention data at scale without an eye tracker.

Submitted 13 January, 2020; originally announced January 2020.

Comments: To appear in CHI 2020. Code available at http://turkeyes.mit.edu/

arXiv:1905.07984

doi 10.1145/3334480.3382980

Are all the frames equally important?

Authors: Oleksii Sidorov, Marius Pedersen, Nam Wook Kim, Sumit Shekhar

Abstract: In this work, we address the problem of measuring and predicting temporal video saliency - a metric which defines the importance of a video frame for human attention. Unlike the conventional spatial saliency which defines the location of the salient regions within a frame (as it is done for still images), temporal saliency considers importance of a frame as a whole and may not exist apart from context. The proposed interface is an interactive cursor-based algorithm for collecting experimental data about temporal saliency. We collect the first human responses and perform their analysis. As a result, we show that qualitatively, the produced scores have very explicit meaning of the semantic changes in a frame, while quantitatively being highly correlated between all the observers. Apart from that, we show that the proposed tool can simultaneously collect fixations similar to the ones produced by eye-tracker in a more affordable way. Further, this approach may be used for creation of first temporal saliency datasets which will allow training computational predictive algorithms. The proposed interface does not rely on any special equipment, which allows to run it remotely and cover a wide audience.

Submitted 12 February, 2020; v1 submitted 20 May, 2019; originally announced May 2019.

Comments: CHI'20 Late Breaking Works

arXiv:1708.02660

doi 10.1145/3126594.3126653

Learning Visual Importance for Graphic Designs and Data Visualizations

Authors: Zoya Bylinskii, Nam Wook Kim, Peter O'Donovan, Sami Alsheikh, Spandan Madan, Hanspeter Pfister, Fredo Durand, Bryan Russell, Aaron Hertzmann

Abstract: Knowing where people look and click on visual designs can provide clues about how the designs are perceived, and where the most important or relevant content lies. The most important content of a visual design can be used for effective summarization or to facilitate retrieval from a database. We present automated models that predict the relative importance of different elements in data visualizations and graphic designs. Our models are neural networks trained on human clicks and importance annotations on hundreds of designs. We collected a new dataset of crowdsourced importance, and analyzed the predictions of our models with respect to ground truth importance and human eye movements. We demonstrate how such predictions of importance can be used for automatic design retargeting and thumbnailing. User studies with hundreds of MTurk participants validate that, with limited post-processing, our importance-driven applications are on par with, or outperform, current state-of-the-art methods, including natural image saliency. We also provide a demonstration of how our importance predictions can be built into interactive design tools to offer immediate feedback during the design process.

Submitted 8 August, 2017; originally announced August 2017.

ACM Class: H.5.1

Journal ref: UIST 2017

arXiv:1703.00800

Creative Community Demystified: A Statistical Overview of Behance

Authors: Nam Wook Kim

Abstract: Online communities are changing the ways that creative professionals such as artists and designers share ideas, receive feedback, and find inspiration. While they became increasingly popular, there have been few studies so far. In this paper, we investigate Behance, an online community site for creatives to maintain relationships with others and showcase their works from various fields such as graphic design, illustration, photography, and fashion. We take a quantitative approach to study three research questions about the site. What attract followers and appreciation of artworks on Behance? what patterns of activity exist around topics? And, lastly, does color play a role in attracting appreciation? In summary, being male suggests more followers and appreciations, most users focus on a few topics, and grayscale colors mean fewer appreciations. This work serves as a preliminary overview of a creative community that later studies can build on.

Submitted 2 March, 2017; originally announced March 2017.

Comments: 10 pages, 8 figures

arXiv:1702.05150

doi 10.1145/3131275

BubbleView: an interface for crowdsourcing image importance maps and tracking visual attention

Authors: Nam Wook Kim, Zoya Bylinskii, Michelle A. Borkin, Krzysztof Z. Gajos, Aude Oliva, Fredo Durand, Hanspeter Pfister

Abstract: In this paper, we present BubbleView, an alternative methodology for eye tracking using discrete mouse clicks to measure which information people consciously choose to examine. BubbleView is a mouse-contingent, moving-window interface in which participants are presented with a series of blurred images and click to reveal "bubbles" - small, circular areas of the image at original resolution, similar to having a confined area of focus like the eye fovea. Across 10 experiments with 28 different parameter combinations, we evaluated BubbleView on a variety of image types: information visualizations, natural images, static webpages, and graphic designs, and compared the clicks to eye fixations collected with eye-trackers in controlled lab settings. We found that BubbleView clicks can both (i) successfully approximate eye fixations on different images, and (ii) be used to rank image and design elements by importance. BubbleView is designed to collect clicks on static images, and works best for defined tasks such as describing the content of an information visualization or measuring image importance. BubbleView data is cleaner and more consistent than related methodologies that use continuous mouse movements. Our analyses validate the use of mouse-contingent, moving-window methodologies as approximating eye fixations for different image and task types.

Submitted 9 August, 2017; v1 submitted 16 February, 2017; originally announced February 2017.

Journal ref: TOCHI 2017

Showing 1–13 of 13 results for author: Kim, N W