Unravel: A fluent code explorer for data wrangling
The 34th Annual ACM Symposium on User Interface Software and Technology, 2021•dl.acm.org
Data scientists have adopted a popular programming design pattern, the fluent interface, for composing data wrangling code. A fluent interface combines multiple transformations on a data table (a dataframe) into a single chain of expressions that produces one output. Although fluent code promotes legibility, the intermediate dataframes are lost, forcing data scientists to unravel the chain through tedious code edits and re-execution. Existing tools for data scientists do not allow easy exploration or support understanding of fluent code. To address this gap, we designed Unravel, a tool that helps data scientists explore and understand fluent code. To explore, data scientists apply simple structural edits via drag-and-drop and toggle switch interactions to reorder and (un)comment lines. To support understanding, Unravel provides function summaries and always-on visualizations that highlight important changes to a dataframe. We discuss the design motivations behind Unravel and how it helps data scientists understand and explore fluent code. In a first-use study with 14 data scientists, we found that Unravel facilitated diverse activities such as validating assumptions about the code or data, exploring alternatives, and revealing function behavior.
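To make the problem concrete, the following is a minimal sketch of a fluent chain and of what "unraveling" it by hand involves. It is not Unravel's implementation or API: the abstract names no language or library, so the `Table` class, the `unravel` helper, the `cars` data, and the step labels are all hypothetical and written in plain Python for illustration. The per-step row counts loosely stand in for the kind of change summary the abstract describes, and the enabled flag on each step mimics (un)commenting a line.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical minimal table. Every method returns a new Table,
    which is what allows calls to be chained fluently."""
    rows: tuple

    def filter(self, predicate):
        return Table(tuple(r for r in self.rows if predicate(r)))

    def mutate(self, name, fn):
        return Table(tuple({**r, name: fn(r)} for r in self.rows))

    def sort_by(self, key, descending=False):
        ordered = sorted(self.rows, key=lambda r: r[key], reverse=descending)
        return Table(tuple(ordered))


def unravel(table, steps):
    """Apply each enabled step in order and keep every intermediate table.

    `steps` is a list of (label, function, enabled) triples. Setting
    `enabled` to False plays the role of commenting a line out, and
    reordering the list plays the role of drag-and-drop reordering.
    """
    snapshots = []
    current = table
    for label, step, enabled in steps:
        if not enabled:
            continue
        rows_before = len(current.rows)
        current = step(current)
        snapshots.append((label, rows_before, len(current.rows), current))
    return current, snapshots


# Illustrative data, not taken from the paper.
cars = Table((
    {"model": "A", "mpg": 30, "weight": 1000},
    {"model": "B", "mpg": 18, "weight": 2000},
    {"model": "C", "mpg": 25, "weight": 1500},
))

# Fluent style: one readable chain, but the intermediate tables are discarded.
fluent_result = (
    cars
    .filter(lambda r: r["mpg"] > 20)
    .mutate("mpg_per_ton", lambda r: r["mpg"] / (r["weight"] / 1000))
    .sort_by("mpg_per_ton", descending=True)
)

# The same chain expressed as separate, toggleable steps.
steps = [
    ("filter mpg > 20",
     lambda t: t.filter(lambda r: r["mpg"] > 20), True),
    ("mutate mpg_per_ton",
     lambda t: t.mutate("mpg_per_ton", lambda r: r["mpg"] / (r["weight"] / 1000)), True),
    ("sort by mpg_per_ton",
     lambda t: t.sort_by("mpg_per_ton", descending=True), True),
]

unraveled_result, snapshots = unravel(cars, steps)
for label, rows_before, rows_after, _ in snapshots:
    print(f"{label}: {rows_before} -> {rows_after} rows")
```

Both versions produce the same final table, but only the stepwise version retains each intermediate result for inspection. Writing the step list by hand is the kind of tedious restructuring the abstract describes; the sketch only shows why a tool that exposes these intermediates directly is useful.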