1 Introduction
Unit visualizations [31] have become an increasingly popular form of data storytelling in both interactive articles and videos [15, 27]. The appeal of unit visualizations, in contrast to aggregate visualizations (e.g., bar charts), is that they harness one-to-one mappings between visual marks and data points to afford diverse, nuanced animations. These animations are especially suitable for communicating data insights as they deliver information in small chunks, smoothly leading viewers to a final takeaway message [14, 19, 54].
However, data-driven articles and videos are notoriously difficult to create due to the inherent complexity of combining a large number of heterogeneous elements (e.g., text, visualizations, animations, and interactions) into a single piece of content [18]. To ensure that effective storytelling can occur, content must be semantically and temporally congruent [53], which leads creators to switch back and forth between a variety of tools that are dedicated to authoring different types of content and then manually resolve any discrepancies that arise when the content is composed into a single article or video. This process results in highly fragmented, repetitive, and tedious workflows [9, 26, 45]. These challenges can be further exacerbated by the complexity of animated unit visualizations (AUVs), which contain a large quantity of individual data points, as well as the diverse and complex animations that creators wish to author.
The goal of this research is thus to systematically explore the challenges that exist while creating data stories that employ complex and compelling AUVs, and then identify opportunities to streamline the creation process. To this end, we conducted a 2-part formative study that consisted of an interview-based study with 6 AUV creators to understand the challenges in their workflows, and a content analysis of 44 data-driven articles and videos to investigate the characteristics and patterns contained within such media.
The formative study found that authoring data stories with AUVs is extremely challenging because such stories are difficult to plan and prototype. Typical prototyping techniques (e.g., sketching) can neither accurately capture the number of visual units nor represent the dynamic animations that creators envision. As a result, creators resort to using programming languages to specify the parameters and behaviors of hundreds or thousands of data points. This process, however, is indirect, tedious, and time-consuming. The content analysis found that data stories use a section-based narrative structure and that, within each section, the text in the story closely corresponds to the underlying data selections, operations, visual encodings, and animations needed for each AUV.
These formative studies highlighted the opportunity to leverage the desired connections and organization between text and visuals to support the planning, exploration, and composition of data stories with rich AUVs. We propose DataParticles, a system that facilitates the authoring of AUVs with two components: a language-oriented animation authoring technique that enables users to leverage natural language to create congruent visualizations and animations, and a block-based story editor that enables creators to flexibly iterate on their story narrative while maintaining congruence with the visuals. To assess the utility of DataParticles, we conducted an expert evaluation with 9 professional data story creators. The results showed that the language-oriented, block-based authoring environment kept the authoring process story-focused and facilitated creation by enabling flexible prototyping.
This research thus contributes:
(1) A formative study with 6 expert creators that identifies the common workflows and pain points that exist when creating data-driven stories with AUVs.
(2) A content analysis of 44 stories with AUVs that identifies common patterns in their narrative structures and the mappings that exist between the text and visuals.
(3) A prototype system, DataParticles, that leverages these text-visual mappings to create a language-driven, block-based authoring experience for data stories with AUVs.
(4) An expert evaluation that confirms the utility of DataParticles, revealing insightful design implications for future design prototyping and language-oriented authoring tools.
While this research focuses on supporting the creation of data stories with AUVs, we envision that the proposed user interface and interaction techniques could be applied to other types of data visualizations (e.g., aggregated charts).
2 Related Work
This work draws on prior research on unit visualizations, data storytelling, natural language-oriented interaction, and block-based interfaces for content creation.
2.1 Unit Visualizations
Unit, or glyph-based, visualizations represent every data point with a unique visual mark, whose visual and spatial properties encode the attributes of that data point [5, 15, 31]. The rich visual representation of each individual data point has made unit visualizations an increasingly popular medium for communicating data in a range of scenarios, such as infographics [62], data-driven animations [3], AR visualizations [8], and data physicalizations [1]. Prior work has presented a series of tools, frameworks, and design spaces to facilitate the creation of unit visualizations. For example, Drucker and Fernandez characterized the design space of common unit visualizations and proposed a unifying framework [15]. Park et al. further explored the design space of unit visualizations and contributed an expressive grammar that formally specifies unit visualizations [31]. This design space was later used during the development of SandDance, which supports the exploration, identification, and communication of insights about data using unit visualizations within a graphical user interface [31]. Besides using simple visual marks such as squares or circles to represent data points, authoring tools have also been developed to support the mapping of data points to glyphs with complex visual structures [5, 8, 62].
While these systems were shown to be effective at supporting the creation of static unit visualizations, creating animated unit visualizations remains a challenge. State-of-the-art data animation systems, such as Data Animator [52], create animated data visualizations by keyframing and interpolating between static visualizations. With such tools, significant manual effort is still required to create rich, complex AUVs. Recently, Lu et al. [27] presented a method to automatically generate AUVs by arranging data facts. However, the limited number of rule-based visual graphs and template-based text descriptions limits the expressiveness and controllability of the generated results. In contrast, DataParticles leverages the textual story context that AUVs are often embedded alongside to reduce the manual effort required when creating AUVs, inferring and recommending animations that are congruent with the story.
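To give intuition for this keyframe-and-interpolate model, the transition between two static unit layouts reduces to blending each mark's properties over time. The following is a minimal, generic TypeScript sketch of that idea under our own simplifying assumptions (a shared set of marks with matching indices), not Data Animator's actual implementation:

```typescript
interface Mark {
  x: number; // horizontal position
  y: number; // vertical position
  r: number; // radius
}

// Linearly interpolate every mark between two static layouts ("keyframes").
// t = 0 yields the source layout, t = 1 the target layout. Richer,
// story-specific animations require authoring many such keyframes by hand,
// which is the manual effort discussed above.
function interpolateLayout(from: Mark[], to: Mark[], t: number): Mark[] {
  return from.map((a, i) => {
    const b = to[i];
    return {
      x: a.x + (b.x - a.x) * t,
      y: a.y + (b.y - a.y) * t,
      r: a.r + (b.r - a.r) * t,
    };
  });
}
```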
2.2 Data-driven Storytelling
Data-driven storytelling research studies how to leverage visualizations to effectively communicate data insights. Segel and Heer categorized seven genres of data-driven storytelling, including articles and videos [36]. Due to the storytelling power they offer and the challenging authoring processes they require, these forms of narrative visualization have been gaining traction in the HCI and visualization communities over the past few years. Prior work in this domain has contributed to understanding the workflows of current authoring practices. Lee et al., for example, identified that data storytelling processes often involve three major steps: finding insights from data, turning insights into narratives, and communicating the narrative to the audience [26]. Similarly, Sultanum et al. showed that authors usually start with a dataset or a particular question to answer using the data and then organize the main findings into a linear structure to present [45]. During this process, synthesizing information across multiple resources, making decisions about rich visual effects, and contextualizing data facts into coherent narratives were found to be challenging [18, 45]. To address these challenges, significant research has been conducted to summarize design spaces that guide and simplify the authoring process [2, 7, 12, 45, 46, 47] and to develop new authoring tools [4, 13, 24, 40, 45, 48].
While all of this prior work has made great strides in supporting storytelling with chart-based visualizations in general, little is known about how unit visualizations are used in data-driven articles and videos. Due to the flexibility and expressiveness of unit visualizations, their design and authoring involve specific challenges and opportunities. Our research seeks to further the understanding of current practices used to author AUVs and the usage of AUVs for storytelling, especially how rich animations can support narratives.
2.3 NLI for Data Visualization
Significant research has leveraged natural language interfaces (NLIs) to create and manipulate data visualizations [39]. For example, Voder automatically generated data facts as natural language sentences with interactive widgets to facilitate data understanding and exploration [41]. DataTone enabled users to create aggregated visualizations with natural language queries and provided interactive widgets to resolve the ambiguities inherent in natural language expressions [16]. Similarly, Eviza leveraged the pragmatic structures in natural language and enabled users to create and revise data visualizations through continuous, interactive conversations [37]. Orko [44] and InChorus [42] supported multimodal interaction with visualizations by leveraging natural language.
To facilitate the development of NLIs for visual analytics and visualization design, Narechania et al. proposed a toolkit that takes a tabular dataset and a natural language query as input and returns structured specifications of data visualizations [29]. Srinivasan et al. provided a benchmark of NLIs for visualization using a curated dataset of visualization-oriented utterances [43]. ArkLang described an intermediate language built upon a set of analytical concepts to infer intents in underspecified natural language utterances [38]. This prior research laid a foundation for us to further explore and leverage the links between natural language and data visualizations.
Instead of requiring users to make explicit natural language queries and specifications, recent work has leveraged the latent correspondences that exist between linguistic structures and content structures in more casual natural language expressions [9, 10, 23, 59, 60]. For example, CrossData provided rich interactions to help users author data documents by automatically establishing text-data connections based on users’ natural writing in a document [9]. Kori linked visuals and natural language by suggesting references between text inputs and charts in an existing gallery to support the creation of interactive visual stories [23]. DataParticles draws from this line of research by employing the desired mappings between visuals and natural language throughout one’s storytelling process. It leverages the flexibility of unit visualizations to create animated visual stories that are congruent with the narration and supports a range of interaction techniques that facilitate the synchronized creation and iteration of AUVs and the underlying narrative story.
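As a toy illustration of how such latent text-visual correspondences can be operationalized (a hypothetical sketch we provide for intuition, not the actual implementation of DataParticles or the systems above), a narration sentence can be heuristically matched against data attribute names and encoding cue words to propose unit-visualization operations:

```typescript
// Hypothetical sketch: propose unit-visualization operations for a sentence
// by matching the data attributes it mentions against simple cue words.
type Operation =
  | { kind: "groupBy"; attribute: string }  // split units into clusters
  | { kind: "colorBy"; attribute: string }; // encode an attribute with color

function inferOperations(sentence: string, attributes: string[]): Operation[] {
  const text = sentence.toLowerCase();
  const ops: Operation[] = [];
  for (const attr of attributes) {
    if (!text.includes(attr.toLowerCase())) continue; // attribute not mentioned
    if (/\b(group|split|divide|separate)\b/.test(text)) {
      ops.push({ kind: "groupBy", attribute: attr });
    } else if (/\b(color|colour|shade)\b/.test(text)) {
      ops.push({ kind: "colorBy", attribute: attr });
    }
  }
  return ops;
}

// e.g., inferOperations("Let's group the pods by roast level",
//   ["roast level", "origin"]) -> [{ kind: "groupBy", attribute: "roast level" }]
```

A real system needs far more robust language understanding than this keyword matching, but the sketch conveys why casual narration, rather than explicit queries, can drive visualization changes.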
2.4 Block-based Editing for Data-driven Content Creation
Computational notebooks (e.g., Jupyter [20], R Markdown [33], and Observable [30]) are modern embodiments of Knuth’s notion of literate programming [22] and have been widely used in data science for data exploration and communication [34]. Many researchers have explored extending such notebooks with features including real-time collaboration [56], the synchronization of code and chat [57], and the synchronization of code and interaction results [58]. Recently, Lau et al. analyzed 60 computational notebooks and proposed a design space to characterize them, revealing that most notebooks adopt a block-based editor style [25]. Compared to traditional document-based editing environments, block-based editors enable users to independently edit, execute, and rearrange blocks within a document, thereby enabling progressive authoring and immediate visual feedback.
This modular planning technique is also commonly seen in the two-column scripts used to plan visual stories [11, 61]. With a two-column format, film scriptwriters organize the narration in one column and the visual directions (e.g., camera movements and background setup) in the other, where each row of the script specifies all the necessary information for a scene [32]. This technique has been broadly adopted by video creators as it allows the consistent organization and flexible reorganization of video segments.
DataParticles leverages the block-based editing paradigm to scaffold the authoring process for data-driven articles, in which authors frequently iterate on, re-organize, and review the content they create. Considering these benefits, we explore the fusion of language-oriented authoring and block-based editing to offer intuitive, flexible interactions while creating AUVs.
3 Formative Study
This research did not begin with a specific focus on animated unit visualizations. Instead, we were initially interested in understanding how data-driven stories that employ complex animations were created and the challenges that exist throughout creators’ workflows. To achieve this, we conducted an interview-based study with experts who had significant experience employing visualizations and animations for storytelling. As animated data stories are often created by a group of creators, we recruited experts across different roles to have a comprehensive understanding of such workflows. Through these interviews, we found that among the various types of data stories the interviewees had created, the ones that consisted of AUVs were noted as being both compelling and challenging to create, as they exacerbated many of the pain points of complex animation authoring processes.
3.1 Interviewees and Procedure
Semi-structured interviews were conducted with six domain experts who had professional experience creating data stories with AUVs. Each interviewee had over 5 years of experience creating different types of animated data stories, and their roles spanned producer, animator, visualization engineer, and journalist.
The interviews began with background questions about interviewees’ roles and professional histories. Interviewees were then asked about their workflows when creating stories with AUVs, including the tools that they used, and the pain points they encountered throughout the process. We summarized our findings in terms of the workflows and challenges in creating AUVs.
3.2 Workflows While Creating AUVs
The workflows employed while creating animated data stories consisted of three major steps. First, narratives about the data stories were developed to guide production. Second, visualizations and animations were created based on the narrative plan, starting from low-fidelity prototypes and moving to high-fidelity designs. Third, the narrative and visuals were composed to form a complete story. While all interviewees commonly followed these three high-level steps, we identified two different approaches during the planning and prototyping phases: data-driven planning and story-driven planning.
In the data-driven planning approach, creators started with a general topic and an existing dataset. They then analyzed the data, gathered excerpts, and organized the excerpts into a coherent story, which is consistent with the findings of prior research [26]. In this approach, creators often conducted a data-driven inquiry to learn more about the topic and created videos to communicate the insights they had discovered from the data analysis. This approach is often used to communicate complex ideas or data in a more accessible manner, making the information easier to understand and remember. The story-driven planning approach is narrative-based: creators began with a written narrative and a pre-determined sequence for presenting their data. They then searched for data facts that aligned with their story and incorporated visualizations to provide supporting evidence. The focus is on creating an engaging narrative that captures the audience’s attention and delivers information in a compelling and convincing way.
Despite their differences, both planning processes shared similar pain points. Herein, we report on the key challenges that creators encountered at each step and highlight how these challenges are exacerbated during the creation of AUVs.
3.3 Challenges Encountered by Creators
3.3.1 Difficulties and Downstream Frustrations While Planning and Prototyping AUVs.
All interviewees employed a two-column script to plan their data stories, where the first column broke down the text narrative into sections and the second column showed the corresponding visuals to be created. However, interviewees noted that specifying desired AUVs was extremely challenging compared to static graphics and aggregated visualizations because the animations of units were difficult to prototype using sketching or digital prototyping tools such as Keynote. It was thus challenging to speculate on the final look of an AUV and whether it would coherently support the narrative. Due to this lack of concrete specifications during the planning stage, creators often made many design decisions during the creation process without knowing whether they might deviate from their overall storytelling goal, requiring more downstream iterations. On the other hand, the significant amount of effort needed to adjust complex AUVs made interviewees reluctant to make substantial changes at these later stages. As E1 noted, “I feel like I’m randomly making 10 design decisions every second, the first time you finish this [AUV] is the first time you see the result, and you never want to go back.”
3.3.2 Tedious and Complex Workflows During AUV Creation.
Creating high-fidelity AUVs required interviewees to use several tools to specify the different aspects of their designs. Interviewees often first used spreadsheet applications (e.g., Excel) and programming toolkits (e.g., R, Python) to clean, prepare, and analyze their data. To animate visualizations, interviewees often resorted to using programming libraries (e.g., d3.js) instead of direct manipulation tools, which become very tedious with large datasets and limit the exploration of visual encodings and animations. However, the nature of programming does not lend itself to rapid exploration, which demanded significant effort from interviewees and also implied a high barrier to entry for creating this type of visualization. As noted by E6, “we have to have a decent background in both design and code to be able to build them, along with a considerable amount of time.” In addition to using the typical view transitions found with unit visualizations, data stories often employed ‘per data point’ animations (e.g., highlighting and pulsing) to guide viewers’ attention through the flow of the story. Although these animations were not technically challenging to create, the sheer number of them required tedious manual effort. In video-based stories especially, interviewees used dedicated animation tools (e.g., Adobe After Effects) for final touches or for animations that were difficult to create programmatically.
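To illustrate the per-data-point specification burden interviewees described, below is a minimal d3.js sketch in TypeScript of a ‘pulse’ highlight over a subset of units. The data fields, selectors, and styling here are our own illustrative assumptions rather than code from any interviewee’s project; the point is that every radius, color, delay, and duration must be spelled out in code for each of potentially hundreds of marks:

```typescript
import * as d3 from "d3";

interface Unit { id: number; x: number; y: number; }

// Draw one circle per data point, then pulse a highlighted subset.
// Each visual parameter (radius, fill, stagger delay, duration) must be
// specified programmatically, which becomes tedious at scale.
function drawAndPulse(units: Unit[], isHighlighted: (d: Unit) => boolean): void {
  const circles = d3
    .select("svg")
    .selectAll<SVGCircleElement, Unit>("circle")
    .data(units, d => String(d.id))
    .join("circle")
    .attr("cx", d => d.x)
    .attr("cy", d => d.y)
    .attr("r", 4)
    .attr("fill", d => (isHighlighted(d) ? "#e45756" : "#bbbbbb"));

  circles
    .filter(isHighlighted)
    .transition()
    .delay((_, i) => i * 30) // stagger each unit's pulse
    .duration(400)
    .attr("r", 8)
    .transition() // chain a second transition to shrink back
    .duration(400)
    .attr("r", 4);
}
```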
3.3.3 Repetitive Synchronization for Story-Visual Congruence.
Interviewees stated that matching the story with the visuals was crucial for effective communication and storytelling. Although the story and visuals were initially planned to align, discrepancies in content (e.g., data insights), format (e.g., movements of the units), and timing (e.g., duration of animations) between the visuals and the story often arose. In the final stage of creation, interviewees underwent multiple refinement iterations focused on eliminating these discrepancies and ensuring congruence. Changes involved tweaking animation timing, modifying the unit visualization layout, or altering the story narrative based on new insights gained during the creation of the AUVs.
Interviewees attributed the discrepancies to the separate development of the visualizations and story as well as the lack of immediate feedback on how the visuals corresponded with the underlying story. This resulted in the need to constantly switch between editing the story and the visuals. The late discovery of these discrepancies often caused more effort to be required in later stages, even for issues that could have been easily resolved earlier.
3.4 Summary
The findings revealed that while the one-to-one mapping between data points and visual marks gives AUVs the narrative power to express certain concepts, it also makes AUVs extremely challenging to plan, explore, and prototype. In addition, the authoring process is distributed across a wide range of tools and environments, resulting in difficulties in maintaining the linkage between the story, data, and visuals. The findings suggest that creators of data stories could significantly benefit from flexible prototyping and structured scaffolding throughout the entire authoring process, especially in the earlier planning stages.
7 Expert Evaluation
To evaluate DataParticles and its two core interaction techniques (i.e., language-oriented and block-based editing), we conducted an expert evaluation to understand how the tool could address creators’ pain points and its potential to be integrated into their workflows, as well as to identify any limitations of and ideas for extending DataParticles.
7.1 Participants
We recruited nine experts in the domain of data storytelling (5 female, 4 male, aged 27-37) through emails. All participants had more than 3 years of experience in data storytelling and had professional experience authoring AUVs. The studies were conducted remotely via Zoom, with participants accessing DataParticles through a web browser. They received a $40 gift card for the 75-90 minute session.
7.2 Study Protocol
Participants first completed a consent form and were then instructed to complete three tasks with DataParticles via screen sharing during the study. The study concluded with a feedback questionnaire and post-study interview.
Introduction and System Walk-through (∼20 minutes). Participants were first given a general introduction to the motivation behind DataParticles. The experimenter then walked the participants through DataParticles’s features by guiding them through creating a tutorial story using a dataset containing ten data items with four data properties. During the walkthrough, the experimenter described the interactions verbally and asked participants to perform specific actions. After the system walkthrough, participants were given time (∼5 minutes) to practice using DataParticles.
Reproduction Task (∼15 minutes). Participants were asked to reproduce the coffee pod story described in Section 6.3. The dataset contained 28 data items with 5 properties. The text narratives and corresponding visualizations were provided in a Google Doc.
Creation Task (∼20 minutes). Participants were given a dataset of 32 jean pockets, adapted from an existing AUV story [14]. Participants were instructed to create their own story with 3-5 blocks. We encouraged participants to try different story narratives and asked them to think aloud during the creation process.
Questionnaire and Semi-structured Interview (∼20 minutes). After completing all the tasks, participants filled out a questionnaire about the usefulness and usability of DataParticles using a 7-point Likert scale, followed by a semi-structured interview for feedback on the system’s utility, effectiveness, and future improvements.
7.3 Results
All participants successfully completed the three tasks. Most participants found the interface easy to learn (3 strongly agree, 6 agree) and easy to use (2 strongly agree, 7 agree). They also reported that they would likely use DataParticles to create data stories (1 strongly agree, 7 agree, 1 somewhat agree). We first report participants’ responses to the two interaction techniques and our observations of how they used DataParticles, and then discuss the system limitations we learned about from participants.
7.3.1 Feedback on Language-oriented Authoring.
Participants found that being able to immediately see the visualizations helped them focus on the story they were telling, and they generally agreed that the visualizations generated by the system matched their narrations (2 strongly agree, 6 agree, 1 somewhat agree). This supports the first design goal (D1) of DataParticles, which was to maintain the connections between narrative stories and the visualizations.
Participants found that the system’s ability to automatically maintain congruence between narrations and visualizations allowed them to “get directly to the story” (P1). This was made possible by the integrated environment offered by the system, a marked improvement over participants’ previous workflows, in which they used spreadsheets to explore data but struggled to focus on the story, e.g., “like a spreadsheet, you just feel you are not pointing to the right direction of telling the story” (P1) and “I have to kind of mold the story after the processing the data” (P5). Additionally, using natural language to express thoughts was found to be a more direct and natural way of communicating, especially in the early stages of story creation, where the story was usually guided by high-level intentions rather than specific configurations. P2 described the process as “Really cool! Because that’s the natural way we think about a story, so you don’t let all the tedious configurations and editing get in the way of your thoughts.” Overall, participants thought that the system was “a lot faster” (P3) to prototype with and more effective than their previous workflows.
7.3.2 Feedback on Block-based Editing.
The block-based authoring environment allowed participants to easily explore the data (2 strongly agree, 5 agree, 2 somewhat agree) and try out different narratives (2 strongly agree, 5 agree, 2 somewhat agree), supporting the need for flexible prototyping. As P1 noted, “it allows you to be more creative without worrying too much if it is making sense.” P3 said, “I really like the idea of using blocks to author the flow. You can basically do all the things that you are interested in.” Notably, the flexibility of prototyping came not only from the block operations themselves but also from the fact that using blocks ensured that visualizations remained synchronized with the corresponding parts of the narrative (6 strongly agree, 2 agree, 1 somewhat agree). P5 also mentioned that the initial auto-generated block, which displayed all of the data points, was helpful for getting started and encouraged their exploration. However, they also suggested that it would be beneficial to offer more options to configure the initial layout based on the story context and the size of the dataset.
The ability to propagate visual states was preferred by participants, who reported that it enabled them to easily prototype and explore different story flows (3 strongly agree, 6 agree). Participants found this function useful when the sequence changed during prototyping: they tended to click the [propagate] button on the multiple blocks that came after a changed block to update the visualizations. In contrast, the [play] button was mostly used when participants had finished their stories and wanted to preview the final result.
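As a reading of this propagation behavior (a hypothetical sketch with names of our own choosing, not DataParticles’ actual code), each block’s visual state can be modeled as the previous block’s state plus that block’s own operations, so editing block k requires recomputing the states of blocks k through n:

```typescript
// Hypothetical sketch of block-state propagation: a block's visual state is
// derived from the previous block's state plus the block's own operations.
interface Block<S, Op> {
  narration: string; // the block's story text
  ops: Op[];         // operations inferred from the narration
  state?: S;         // cached visual state after applying ops
}

function propagateFrom<S, Op>(
  blocks: Block<S, Op>[],
  changed: number,                    // index of the edited block
  apply: (state: S, ops: Op[]) => S,  // applies one block's operations
  initial: S                          // visual state before the first block
): void {
  let state = changed === 0 ? initial : blocks[changed - 1].state!;
  for (let i = changed; i < blocks.length; i++) {
    state = apply(state, blocks[i].ops); // recompute downstream states in order
    blocks[i].state = state;
  }
}
```

Under this model, [propagate] corresponds to rerunning the loop from the changed block onward, while [play] simply replays the already-computed states in sequence.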
7.3.3 Workflows with DataParticles.
During the construction of their stories, the authoring process followed a pattern of exploration, construction, and polishing. In the exploration stage, participants tended to incrementally reveal data facts by inputting sentences that involved a single segment of the dataset. For example, instead of mentioning the pocket size for men and women in one sentence, most participants (8/9) explored the gender and size properties separately. Moreover, it was common to start by adding several blocks and then deleting them. In the construction stage, participants adjusted the sequence more often by inserting blocks that they had explored before than by dragging blocks, as dragging could lead to significant changes in the entire sequence, which made them feel less in control (P9). Participants may thus benefit from block operations that further support less disruptive design exploration, as discussed in the limitations section below.
Multiple participants mentioned that they valued being able to quickly see whether their story made sense or not. DataParticles offered a unique way to plan by using visual reasoning. As P6 noted, “Sometimes I am not sure whether to use size, position, opacity, or color to represent the data... it can be hard to think of, so why not just write down what makes the most sense and see how it looks?” The potential for visual reasoning is further discussed in Section 8.2.
7.3.4 System Limitations.
Our participants had mixed opinions on the creative freedom provided by the system (1 strongly agree, 2 agree, 3 somewhat agree, 2 neutral, 1 somewhat disagree). We identified limitations of the current system that may explain this in terms of the expressiveness of the visual content and flexibility in supporting prototyping.
While DataParticles was able to support most data selections and operations during the authoring process, participants felt limited by the visual effects available in the system, due to the lack of a comprehensive set of effects and of mappings between natural language expressions and those effects. For example, P9 wanted to encode data points “using different shades of blue,” which is not currently supported by DataParticles. Participants also reported wanting to create animations that simulated camera movements, such as zooming in and out of scenes, and animations that corresponded to more complex narrative structures, such as using a side-by-side view to present comparisons between groups of data points.
Participants also found that the linear structure of the blocks limited the flexibility of prototyping. We observed that participants constantly added and deleted blocks to explore different flows before finalizing their stories. P5 said, “it would be great if I could review my editing history and choose from them.” P8 and P9 also desired to see multiple designs at the same time. These findings suggest that DataParticles could benefit from additional block organizations and operations, such as branching, hiding blocks, and grouping and folding sequences of blocks, to support more flexible and less disruptive parallel prototyping.
We recognized that the current task design encouraged a “data-driven” planning process, in which participants first explored the data without polished narratives. Within this process, participants felt limited by the data exploration DataParticles offers. P7 explicitly mentioned, “I see it fits into the middle part of my workflow, when I have the dataset cleaned up and want to figure out what to tell.” The current blocks only maintain the link between the text and the visuals, and could benefit from additional links to the underlying dataset. As P4 noted, “It would be powerful to show different data view for different blocks.” Interestingly, participants also mentioned that they could imagine using the system for a story-driven planning process. As mentioned by both P2 and P8, it would be useful if users could paste their script into the system and generate visualizations using simple markups. This may be challenging for the current prototype system, which is designed to support well-chunked, descriptive text input. This design decision was made based on the unique language characteristics of data stories, where the visualizations unfold incrementally with small changes. While the system was able to properly understand participants’ intentions during the study, this may be insufficient for story-driven planning. To support it, the system would need to handle more flexible text input with improved natural language processing capabilities.
7.4 Summary
All nine expert participants were excited about the potential of using DataParticles to improve their workflows when creating data stories with rich AUVs. They appreciated that DataParticles enabled them to quickly visualize their ideas and experiment with different story ideas like a “digital sketch book for data.” Specifically, language-oriented authoring enabled participants to be story-driven, while block-based editing enabled quick and flexible prototyping. The study also revealed that the limited set of visual effects and block operations offered by the system may have restricted users’ creative freedom and could prevent them from fully expressing their design ideas with the tool.
8 Discussion and Future Work
Based on our observations throughout the study, we believe that there are several opportunities to improve the system and provide more support for the creation process. In this discussion, we summarize them in terms of streamlining the creation process, supporting visual reasoning, and extending the concepts to broader visualizations. We hope that with future research in these areas, we can enhance the capabilities of DataParticles and make it a more powerful and versatile tool for AUV creation.
8.1 Supporting the Entire Creation Process
While the participants appreciated the prototyping power of DataParticles, they pointed out that more “low-level control” is needed to produce high-fidelity content and create unique stories and visuals. P5 said, “the upside of that (using code) is that each project is distinctive and we don’t have to settle for pre-set styles and patterns.” Moreover, intelligent tools can lead to downstream frustration because creators may not know the limits of the system. P6 said, “I am willing to put more time and effort into my final product... so I don’t mind learning a tool that gives me every bit of control over it.”
Participants suggested that the system should be able to export designs in a format that can be refined with other dedicated tools. As we found in the formative study, developing content in separate environments may break the congruence between components, which had required significant effort to establish at earlier stages. As our research has shown, by uniting the story and the corresponding AUVs within blocks, creators can freely explore different story narratives without breaking the connections between the narrative stories and AUVs. Therefore, we believe that to support a fluid workflow for the entire design process, the intrinsic relationships amongst the various content components should be preserved throughout the entire creation process.
One idea to explore is to extend our tool from a two-column to an N-column format to accommodate the various elements and tools involved in the creation process. With N-columns, iterations can incrementally occur between columns with a side-by-side view, allowing creators to gradually mold a low-fidelity prototype into a higher-fidelity product using more advanced tools and programming toolkits. To achieve this vision of streamlining the creation process, it is critical to find proper intermediate representations that can preserve enough information from lower-fidelity stages and be flexible to use in higher-fidelity stages.
8.2 Supporting Visual Reasoning
In our formative research, we identified two approaches to data-driven storytelling: a data-driven approach, where the story was planned based on the data facts that creators wanted to convey, and a story-driven approach, where creators used the data and visualizations that best fit their story. Our user study also revealed an alternative approach, in which participants focused on visual storytelling. For example, P8 said, “I found it interesting that when I constructed the story with this system, I was reasoning on whether the visualizations and transitions between them made sense to me rather than the text itself. After I was satisfied with the animations, I went back to the text and completed the story.” This aligns with what we observed in many of the data-driven stories that employed rich AUVs, in which the key data insights were often self-explanatory from the visual content alone.
Moreover, to facilitate visual comprehension and engagement, AUVs are also blended with other types of visual content (e.g., images, simulations, and character animations). Such blending can create more immersive and engaging narratives that better situate the visualizations within the context of the story. For example, when a story was about basketball players, the units were animated to bounce like basketballs on the ground [35]. When discussing constellations, the visual units were animated to rotate in a circular layout like actual constellations and were situated within a collection of satellite images [6]. These visualizations are compelling yet more challenging to create due to the high level of creativity they require. We are interested in exploring the potential of incorporating state-of-the-art generative models into our system to facilitate the creation of such visualizations. With the advent of powerful large language models (e.g., GPT-3), we envision an intelligent system that could suggest possible blending strategies given the narrative to help better situate visualizations within the story.
8.3 Beyond Unit Visualizations
The type of unit visualization supported by DataParticles falls between aggregated visualizations (e.g., bar charts, line charts) and more complex particle animations (e.g., waffle charts, physical simulations). Although these visualizations are not within the scope of our system, they are commonly used in data storytelling [17] and could potentially benefit from a more flexible prototyping experience. Herein, we discuss the opportunities and challenges of extending our system to support these types of visualizations.
Aggregated visualizations contain simpler graphic elements and more structured layouts, which makes them a promising candidate for establishing a link between text and visuals throughout a story, especially with the help of existing work in this domain [9, 38]. However, unlike unit visualizations, whose visual marks almost always remain unchanged during animations, transitions of aggregated visualizations often employ complex graphical transformations, such as transforming a line chart into a bar chart by morphing lines and dots into rectangles. Physical simulations, on the other hand, while using consistent visual marks, often employ complex particle movements to represent semantic meanings.
In contrast to the one-to-one mapping between data points and visual marks in unit visualizations, other types of visualizations often utilize more flexible mappings between the data points and the visuals. In both aggregated visualizations and simulations, a visual unit can represent multiple data points or only selected data points, and the number of visual units can be determined by the narration. For example, a waffle chart that consists of a grid of small cells often animates according to the approximate percentages mentioned in the text, rather than being truly data-driven. These complex animations and flexible data-visual mappings do not conflict with the language-oriented and block-based editing that DataParticles proposes. We are interested in extending DataParticles with a more advanced animation engine and intelligent data selection techniques to support a broader range of visualizations.
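To make the waffle-chart example concrete, the mapping from a narrated percentage to filled cells can be sketched as follows (a toy TypeScript illustration under our own assumptions, with a 10×10 grid and a made-up narrated figure):

```typescript
// Toy sketch: derive a waffle chart's filled cells from a narrated percentage.
// The cell count approximates the percentage rather than mapping one-to-one
// onto data points, illustrating the flexible data-visual mapping above.
function waffleCells(percentage: number, rows = 10, cols = 10): boolean[][] {
  const filled = Math.round((percentage / 100) * rows * cols);
  return Array.from({ length: rows }, (_, r) =>
    Array.from({ length: cols }, (_, c) => r * cols + c < filled)
  );
}

// e.g., a narration of "about 37%" (illustrative figure) fills 37 of 100 cells:
// waffleCells(37).flat().filter(Boolean).length === 37
```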