5.1 Effects on Empathy
We first analyze results for the State Empathy Scale [22] across conditions. Because a Shapiro-Wilk test indicates that the data are not normally distributed across conditions, we calculate p-values using a one-sided Mann-Whitney U test. We determine statistical significance using a p-value threshold of 0.0083, adjusted using Bonferroni correction for six comparisons (the control/regular condition compared to each of the six experimental conditions). Note that reported p-values are relative to the control/regular text condition. The experimental conditions bold (mean = 2.72, std = 0.80, p = 0.11), spacing (mean = 2.72, std = 0.68, p = 0.071), color (mean = 2.81, std = 0.82, p = 0.014), emoji (mean = 2.73, std = 0.68, p = 0.043), and bold + spacing + color (mean = 2.57, std = 0.74, p = 0.31) show average increases in empathy, although none of these increases is statistically significant when compared to the regular text condition (mean = 2.47, std = 0.75). Interestingly, the only condition where participants’ empathy decreased relative to the regular text condition was the all features condition (mean = 2.18, std = 1.28, p = 0.71), and this condition also had the greatest standard deviation in empathy scores. Based on our qualitative data, we hypothesize that this is because participants found the combination of all features jarring and distracting from the underlying emotional meaning of the story. As shown in Fig. 3, participants in the color condition had the greatest increase in average empathy over the regular text condition, followed by emoji, spacing, bold, bold + spacing + color, and all features.
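The testing procedure described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `mann_whitney_u_greater` and the toy scores are hypothetical, the p-value uses the normal approximation with tie and continuity corrections (a statistics library would normally be used instead), the Shapiro-Wilk normality check is omitted, and the base significance level of 0.05 is an assumption that is consistent with the stated threshold (0.05/6 ≈ 0.0083). The p-values in `reported_p` are the ones reported in this section.

```python
import math

def mann_whitney_u_greater(treatment, control):
    """One-sided Mann-Whitney U test (H1: treatment tends to exceed control).

    Normal approximation with tie and continuity corrections.
    Returns (U statistic of the treatment group, p-value).
    """
    n1, n2 = len(treatment), len(control)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in treatment] + [(v, 1) for v in control])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = rank_sum - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all observations identical: no evidence of a shift
    z = (u1 - mean_u - 0.5) / math.sqrt(var_u)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability
    return u1, p_value

# Bonferroni correction: assumed base alpha of 0.05 split over six comparisons.
alpha = 0.05 / 6

# p-values reported in this section (each condition vs. regular text).
reported_p = {
    "bold": 0.11,
    "spacing": 0.071,
    "color": 0.014,
    "emoji": 0.043,
    "bold + spacing + color": 0.31,
    "all features": 0.71,
}
significant = [name for name, p in reported_p.items() if p < alpha]

# Toy, made-up scores that only demonstrate the call.
u_stat, p_toy = mann_whitney_u_greater([3, 4, 5], [1, 2])
```

With the reported p-values, `significant` is empty, which matches the statement that no condition differs significantly from the regular text condition after correction.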
Psychometric survey data alone offer only one view of how the font elements affected empathy with the stories. In addition, we examine what participants wrote in their free responses about what they empathized with in the stories. To analyze this, we use dimensions from LIWC (Linguistic Inquiry and Word Count) [23]. In particular, we look at the total count of emotional language (emo_pos + emo_neg) used in free responses across conditions (Fig. 4). We find that when compared to the regular text condition (mean = 1.45, std = 2.49), participants in the color condition use the most emotional language on average (mean = 2.73, std = 3.18, p = 0.017), followed by all features (mean = 1.8, std = 3.0, p = 0.37), emoji (mean = 1.77, std = 3.23, p = 0.54), spacing (mean = 1.66, std = 2.82, p = 0.41), and bold (mean = 1.5, std = 3.83, p = 0.74). Again, although the differences are not statistically significant, participants in the color condition used, on average, the most emotional language relative to the regular text condition. We hypothesize that this could be because color draws attention to the emotions present in the text. To validate this, we also asked participants whether they felt that the font design helped them perceive emotions in the text. We found that, consistent with the LIWC results, participants in the color condition reported the highest average agreement with this statement (mean = 1.95, std = 0.94) when compared to the regular text condition (mean = 1.5, std = 0.93), although this difference is not statistically significant (p = 0.03).
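The emotional language measure above can be read as a dictionary-based count. The sketch below illustrates that idea only: the LIWC dictionary is proprietary, so `TOY_EMO_POS` and `TOY_EMO_NEG` are small placeholder word lists, `emotion_word_count` is a hypothetical helper, and the simple tokenization and exact matching are assumptions rather than the LIWC procedure.

```python
import re

# Placeholder word lists for illustration; these are NOT the LIWC categories.
TOY_EMO_POS = {"hopeful", "happy", "love", "grateful"}
TOY_EMO_NEG = {"sad", "worried", "guilt", "afraid"}

def emotion_word_count(text, pos=TOY_EMO_POS, neg=TOY_EMO_NEG):
    """Return the number of tokens matching the positive or negative list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n_pos = sum(tok in pos for tok in tokens)
    n_neg = sum(tok in neg for tok in tokens)
    return n_pos + n_neg  # analogous to emo_pos + emo_neg

example_count = emotion_word_count("I felt sad and worried, but also hopeful.")
```

Averaging such per-response counts within each condition would yield condition means comparable in form to those reported above.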
While we found that participants in the all features condition were distracted by too many changes in the font, many participants in this condition wrote meaningful responses to the empathy free response survey question. For example, one participant self-disclosed, “I live in a bordertown and I often think about my grandparents coming to America. They fled the pogroms in Russia. I know we have people fleeing their homeland for varies humanitarian reasons. I worry about how unwelcoming we have become. I do not know what the solution is and how we can actually help other’s be able to stay safely in their homelands.” Another wrote, “I somewhat empathized with the feelings of guilt over escaping a bad situation. This story tangentially reminded me of my experience as an LGBTQ+ individual and how although I’ve experienced oppression and hate, others in the community have experienced it to a much harsher extent.” Future work can further explore the relationship between font properties and self-disclosure when empathizing with another person’s story.
5.2 Design Considerations
We hypothesize that some of the challenges in conveying emotion through AffType stem from limitations in the speech-to-font mappings. At the end of the study, we asked each participant to rate their agreement with how effective each font design change (bold, spacing, color, or emoji) was in capturing the intended speech characteristic (loudness, pace, positive/negative sentiment, or general sentiment). Overall, we found that participants were, on average, neutral toward these mappings, with a slight preference for boldness (mean = 2.19, std = 1.22), followed by color (mean = 1.94, std = 1.31), emoji (mean = 1.86, std = 1.28), and spacing (mean = 1.82, std = 1.24). Although the alterations we made to the text were motivated by prior work [7, 26], the effectiveness of these mappings in our application could be improved. In the rest of this section, we provide insight from participants’ responses on their likes, dislikes, and suggestions for which font design elements could improve empathy with personal stories.
For each of the following analyses, we used qualitative coding to identify core themes in participants’ responses. Three researchers independently coded the survey responses, and commonalities were extracted as major themes. As shown in Table 1, participants liked the more natural and human quality of the text, commenting on how "it felt like someone was speaking to me" and "it looked more personal." Some participants preferred spacing for matching the pace of the story and readily understood that color was associated with emotion. Other participants found boldness helpful in drawing emphasis to specific points. From Table 2, we see that participants disliked the way the font interrupted the flow of the story and commented that too many text alterations weakened the correlation between the style and the meaning of the story. In addition, participants commented that emojis were not effective in promoting empathy, as they affected how the writing style was perceived and made it appear more childish.
Finally, participants expressed what they wished was different about the way the text was displayed in order to increase empathy with the story. As shown in Table 3, a major theme was using standard writing to convey emotion in the story with minimal unnatural text alterations. For example, participants suggested including explanations of how something was spoken, such as the tone the narrator used, whether they sighed, or what their facial expression was. Others suggested using common text formatting like italics and paragraphs, indicating that seamless integration of the narrator’s spoken emotions into the text is an important property of the system. Finally, participants suggested using photos instead of emojis to augment the story and preserve the formal quality of the writing, as well as using the text to better contextualize the narrator’s experiences.
Based on participant feedback and survey results, we summarize the following design insights for AI-driven empathetic fonts: (1) readable – alterations in text should not distract from the clarity of the story, (2) natural – speech-to-font mappings should be intuitive, (3) colorful – colors represent emotions well, (4) appropriate – alterations to text should not affect how the writing style is perceived (e.g., emojis make writing more informal), (5) explainable – speech characteristics should be explained directly by the text, and (6) personalized and culturally sensitive – features could be interpreted differently across people and cultures.