In Section 3, we described the process by which we created inclusive and accessible imagery in collaboration with designers at Google. However, as we interviewed stakeholders and accessibility practitioners in the company, we found that outside of research collaborations such as ours, in-depth iteration on alt text with feedback from users was rarely if ever used by practitioners. Section 5.1 will describe the four types of alt text creation described by practitioners or used in collaboration with Google: The User-Evaluation Process (used in Study 1), The Lone Writer Process, The Team Write-A-Thon Process, and The Artist-Writer Process. For each process, we describe the key steps involved, how relevant actors (users, alt text writers, artists or designers, etc.) interact with or inform the final alt text, and the best features or largest drawbacks of the process compared to the others. Following that, we briefly describe three additional themes that point toward areas for improvement in future alt text production processes in industry.
5.1 Four Models of Alt Text Creation
Before we present the four different models of alt text creation, we begin by describing the general context in which the models our interviewees discussed were situated. The accessibility practitioners we spoke to had varying degrees of familiarity with organizational approaches to accessibility outside the specific projects and teams they had worked on. According to AP2, Google's approach to accessibility “[is] on multiple levels… There's a bottom-up approach; there is a bit of a top-to-bottom approach.” This top-to-bottom and bottom-up approach was demonstrated in the alt text creation processes as well. AP1 described the organization-level requirements for alt text passed down to product teams: “In any product launch, you have the accessibility review and a legal review, amongst others like engineering code reviews… [so] you have a [Quality Assurance] tester—accessibility tester—then going through the content [to assure image descriptions] make sense in the context of the task that they're trying to accomplish in order to test it.” In addition to these top-down processes, more bottom-up resources were used. For example, AP1 and AP3 both discussed being aware of or using a variety of alt text guidelines existing in different forms and formats across teams. The manifold nature of the organization–as a large tech company with many teams and divisions–required a similarly manifold approach to accessibility and alt text generation in particular, demonstrated in the differing methods that accessibility practitioners described.
The role of individual interest and advocacy as a complement to existing organizational structures was also an ongoing theme in discussions of accessibility practices with the practitioners. AP2 pointed out there were both formal and informal structures to support accessibility, with many individuals–particularly engineers–having their personal “curiosity” supported by provided training and networking opportunities: “Notwithstanding the fact that people can just connect with each other [through] bug bashes and workshops and events… there's plenty of opportunities to learn about accessibility.” As we will see in the next section, personal passion and dedication among accessibility practitioners were particularly valuable in some alt text generation methods such as the Lone Writer process (Section 5.1.2). This overarching context for the four alt text creation methods also highlights the importance of organizational support and formal policies in enabling robust alt text creation.
5.1.1 The User-Evaluation Process.
We begin with the alt text generation method demonstrated in Section 4, which we call the User-Evaluation process because it involved the direct evaluation and revision of image descriptions with users. In our enactment of the process in Study 1, we particularly focused on centering the perspectives of users with disabilities through interviews and focus groups. In principle, however, the core feature of the User-Evaluation process is the inclusion of users in the direct evaluation and revision of image descriptions, which could be achieved through almost any method (e.g., surveys, usability tests, A/B testing) depending on the resources and time available or the amount of qualitative feedback desired.
An important element of the User-Evaluation method is how knowledge and responsibility are distributed. Acting as the alt text writers in Study 1, we based our original version of the alt text on the descriptions given by the artist, but otherwise had creative control over the alt text and how user feedback was integrated. User comments impacted the final alt text, but (at least in our case) did not filter back to the artists. This was not for lack of interest: A2 in particular said any feedback “would be a great enrichment and I would gladly change any part of my illustrations.” A1 was similarly open to changing the images based on user feedback. However, because the User-Evaluation process took months, coming back to the artists and asking them to redraft the images was logistically impossible. This is an important limitation: unlike in the Artist-Writer process (Section 5.1.4), it can be hard to iterate on both imagery and alt text using this process, yet, as Study 1 shows, the final received understanding of the alt text is shaped not just by the alt text but by the imagery itself. Thus, a drawback of the User-Evaluation process may be the close collaboration and careful managing of timelines needed to properly distribute the feedback from users to all players in the imagery creation process.
Practical feasibility may be the largest drawback of the User-Evaluation process; buy-in for the method, at least, was very high. User feedback was something all of the artists (A1–A3) and accessibility practitioners who had experience with directly creating alt text or images for user consumption (AP1 and AP3) discussed as valuable. AP3 explained that “If possible, it would be great to get some actual feedback from users… based on these descriptions that I wrote, how confident do they feel in selecting an image as their profile picture? …. Should I have been more detailed? … Or was it okay to be really spare?” The issue of user feedback may be particularly pertinent for AP3 because, as we'll discuss in Section 5.1.2, she used the Lone Writer process and may have felt more fully responsible for the overall alt text. She said, “I think if I were going to do a big set again, it would be good to actually get some feedback.” The enthusiasm for user feedback among our participants suggests that this process may be ideal for situations where buy-in from different stakeholders is important, such as image descriptions with broad cross-product or cross-organization applications.
5.1.2 The Lone Writer Process.
In contrast to the many potential stakeholders involved in the User-Evaluation process, the Lone Writer process is in some ways the most straightforward. The process is characterized by a body of images being passed to a single alt text writer, who creates the alt text for all the images, before they are passed down to users. The Lone Writer process, as it was described to us by AP3, was: “Working on a collection of 800 illustrations where I did write the accessibility descriptions for all of them.” Describing her process, AP3 said: “I sat down and just started going through as many as I could… I think I did a calculation that it would take me about a minute to write each one so whatever 800 minutes is… yeah, I ended up working a couple of weekends to get this done.” The Lone Writer process is defined by this single solitary workstream, where rather than collaboration within or across groups of actors (users, writers, artists) the content flows unidirectionally through the layers of alt text creation.
One of the features of this process was that AP3 was able, because she was the only writer of the alt text, to keep a certain level of consistency between images. “I dug up the guidelines that I already knew about from my first team, where I was a temp and read through them… Some of the guidelines for that, they were going to make things too long, because it was you know, like ‘10 words’ and I was like ‘Oh, I think, like six maybe for this’ just because I don't want [it] to take four hours for somebody to go through the whole library… one of my guidelines for myself was like, don't refer to colors [because they could be customized by the user]… I tried to limit myself to four to six words generally and using present tense.” These adaptations of existing guidelines–and the resulting similarities in length and tense across the 800 images–were simple to decide on and implement as AP3 was solely responsible for the alt text. Whether this degree of consistency is ultimately a positive in terms of user experience was not assessed.
However, one clear logistical drawback of the Lone Writer process is the potential, as seen in this example, for a high burden to be placed on one individual. AP3 at multiple points described the effort put into the 800 images over two weeks. When asked if this was a task she more or less assigned to herself, she agreed “it definitely was.” She explained that she worked on it over the weekends “just because it was really important to me to do it.” She felt personally responsible for creating more accessible images, to the point that she put in time outside of working hours. She added that if a similar situation arose in the future, “we've got to have these descriptions and I will write them even if I have to do it over two weekends. I think overall it's important to do this work.” The Lone Writer process by its nature means that a high degree of creative burden, both in terms of time and in terms of responsibility, falls on one person.
Comparing AP3’s experience with the Lone Writer process to our experience with the User-Evaluation process for the Avatar Project also suggests some potential tradeoffs of this method. AP3 characterized the images as including a wide range of subjects, including real life locations, objects arranged in a room, and humans or pets engaged in activities intended to be relatable to users. This is very different from the detailed portraits of individuals or concepts included in the Avatar Project, which were intended specifically to encourage reflection on inclusion and identity. This may partially explain the significantly shorter amount of time–approximately one minute per image–that AP3 took to create the alt text compared to the several months and more than half a dozen iterations that the User-Evaluation method required. This suggests that for complex images the Lone Writer process may not scale, or conversely, that for large sets of images changes would have to be made to the User-Evaluation process as we conducted it.
5.1.3 The Team Write-A-Thon Process.
AP1 and AP2 both had experience with the third type of alt text creation model, which we call the Team Write-A-Thon process. The essence of the Team Write-A-Thon process is a group of alt text writers collaboratively developing alt text. AP1 spoke to this model when she described a “fix-it” day she had organized for her co-workers to participate in alt text creation. She describes “fix-its” as “an engineering team concept… where you get everyone together and you have a list of tasks and you just like run through them all because that's the best and easier way to get the job done that has not been figured into anyone's priorities yet. No surprise, that's always alt text.” She said for the specific “fix-it” she ran she “got anyone, not just designers but anyone from our team, like forty people maybe, to sit down and describe the images and … then [co-worker] and I [would] then review it and add it to the CMS.” The fix-it approach was described as one way that multiple alt text writers can have a chance to sit down together and work on accessibility as a group.
The Team Write-A-Thon Process, as a collaborative approach, has the potential to reveal how varied people's opinions are about writing image descriptions. As AP1 described, “It was also a good experience if you want to see just how different two different people can think the alt text should be… someone is going to say, ‘this is a mobile UI with this, this, this.’ Someone else is gonna say like, ‘a news app and five buttons.’ And you're like, ‘I dunno? Would that be the same thing?’” The subjective nature of alt text production becomes more obvious when, as in AP1’s case, there are many writers describing images with “a lot of repeated constructs” but the alt text produced is very different. AP1 felt that the fix-it day resulted in understanding “a few different standardized ways of describing things… [Because] there's like three different ways of saying what really should be the same thing. So actually, consistency is a big chore across it all.” Thus, the Team Write-A-Thon model may also create a level of consistency similar to the Lone Writer model, but through consensus rather than a single person's decision. It may also have the advantage of placing less of a burden of work on just one person.
Although getting agreement from multiple people may result in a clearer and more consistent understanding across the group, reaching that level of agreement can be very time-consuming and laborious. AP2 describes some of the drawbacks of a collaborative approach: “When we're redesigning the [product] website… the question did become pretty quickly… ‘I'm going to be spending the next half a year labeling 1,000 images, or is there a better way we can spend our time?’… Now you've got 20 designers sitting in the room trying to say like ‘Well, this is the right way to say it. Do I give enough information? Do [I] not give enough information?’” Essentially, while a collaborative approach between many writers “[done] in a very standard fashion” may ideally lead to a distributed load with higher consistency, in practice it can lead to more time and effort being spent coordinating the writers and getting consensus from everyone.
5.1.4 The Artist-Writer Process.
This brings us to the last alt text production process we are going to describe: The Artist-Writer Process. For the prior processes, we did not discuss the role of the artist because that role was always minimal, with a clear hand-off from the artist well before and completely separate from the alt text creation process. The Artist-Writer Process, in contrast, occurs when there is either direct collaboration between image creators and alt text creators, or when the image creator and the alt text creator are the same person. A key finding here was that who counts as an “image creator” is actually more complicated than assumed. Two of the three artists we interviewed explained that their role as an artist was not simply to create an illustration, but to collaborate with a person writing the words accompanying the image. A2 said very clearly: “There is always a lot of mediation and collaboration between the art director, the author, the illustrator and the reader.” A3 described “the traditional model” as being a collaboration between “the copywriter and the art-director.” When an artist becomes part of an Artist-Writer process, they become a more involved co-creator of the final meaning of the alt text.
The Artist-Writer process has two variations. The first is having the image creator and the alt text writer be the same person. AP1 explains this is, in her experience, rare: “[my colleague is] not like any other person who contributes imagery in that she does write the alt text herself.” The more common practice in AP1’s team before the Team Write-A-Thon method was attempted (see Section 5.1.3) was the second version of the Artist-Writer process. Under this version of the model, a single author wrote the text of the webpage, made decisions behind the images, and “[it] was not a complete submission until it has alt text, designers must contribute to alt text.” This meant there was still a single source of creative control over imagery and alt text, but that it took the form of one person overseeing the artist and writing the text, rather than actually making the imagery themselves.
Having a single author in charge of content, imagery, and alt text had drawbacks, however. AP1 explained: “The goal… is that the person who is responsible for the information about a given page or topic would also write the alt text for the image, [because they are] the subject matter expert on it. But in practice it never works.” The reason it didn't work, according to AP1, was that general writers did not have specialized knowledge of or experience with alt text: “I feel like the amount of training versus the frequency someone would do it… the overhead was too high. Like someone might actually only end up writing alt text once a year [so] they are not going to get good at it.” In theory, AP1 felt, “[it's] important education-wise, but they are not going to be a good contributor.” AP1 knew “I'm going to have to rewrite it.” Thus, the Artist-Writer process had tradeoffs to either quality or conservation of effort.
However, there are still valuable elements to the Artist-Writer approach. For one thing, the educational element of distributing alt text production labor across a large team should not be dismissed. There is also an implication that if a minimum alt text quality could be assured, it would avoid the time or manpower bottlenecks seen in the Lone Writer or the Team Write-A-Thon processes. Lastly, the unified authorial intent that the Artist-Writer process facilitated was closest to the understanding of “intent” we and end users had as our baseline understanding in Study 1. This may make it a process that could be easily integrated with elements of the User-Evaluation process because, as we'll discuss in the next section, clear delineations between the methods are not always necessary or entirely accurate to the in situ experience of the processes.
5.1.5 Deconstructing Delineations Between Types of Alt Text Creation.
Although we have labeled and described these four alt text creation processes as distinct, in practice the delineations between them were not always so clear cut. AP3 explained that, even in the Lone Writer process, she didn't work entirely alone: “There were a few images where it was a lot harder [to write alt text] … so I did set aside some of them [to give to a colleague] as like ‘I'm having trouble… What feedback can you give me?’” Even, or perhaps especially, writers who take full responsibility for alt text don't work in complete isolation, making the distinction in practice between the Lone Writer process and the Team Write-A-Thon process not always easy to determine. Similarly, the User-Evaluation process, as we practiced it in Study 1, involved writing the iterations of the alt text not just in collaboration with the users, but in collaboration with different researchers on the team, making it reminiscent of the Team Write-A-Thon approach.
Additionally, even in methods that, unlike the User-Evaluation process, did not directly elicit user feedback, artists and accessibility practitioners discussed prior knowledge and experiences with disabilities informing their approach to accessible imagery in a way that complicates the simple forward march of ideas from creators to end users. A1 explained: “Since my sister is autistic and is non-verbal, I was exposed early on to neurodiversity and different disabilities…. My uncle has a prosthetic leg so that was just a very normal sight as a child… These things really opened my eyes about the concept of accessibility for all different kinds of people as well as the constant daily challenges that exist everywhere because our society was built for neurotypical, able-bodied people.” AP3 similarly said that her personal relationship to assistive technology made her more passionate about making imagery accessible: “I don't have to use a lot of assistive features… but there are things that are helpful for me even though I don't need them, like automatic captioning of YouTube videos… if I want to watch a YouTube video in bed without waking up my spouse it's lovely to be able to actually do that without having to get out of bed and go find earbuds.” People with disabilities, even in more direct models where accessible imagery is not being created with direct consultation with users, still impact the way accessibility is done, because all actors in the imagery and alt text creation process are filtering information through their past experiences, including–for many–personal experiences with people with disabilities.
5.2 Three Further Themes from Inclusive and Accessible Imagery in Practice
In this section, we describe three additional themes from the interviews with image and alt text creators that ultimately shape the overall process of inclusive imagery creation in the studied company. First, we describe the way artists and accessibility practitioners fulfilled the expectations users in Study 1 had by considering the intended use case when creating imagery and alt text. Next, we describe how the alt text creation processes we have just described were impacted by the macro pragmatic and micro personal considerations of different actors, including the organization. Lastly, we describe how inclusive imagery as it was done at Google had clear champions for the words and the images but no set role to champion the alt text, leaving a potentially fillable gap in future alt text creation processes.
5.2.1 Intended Use Cases and Identity Depictions in Inclusive Imagery.
Just as users in Study 1 believed it ought to be, we found the intended use case for the images impacted the entire inclusive imagery and alt text production process. For example, AP3 thought about the use case when deciding how or if to explicitly mention race in alt text: “We have an image of a person riding a motorcycle… [and the alt text just says] ‘Person riding motorcycle.’ You're just trying to keep it pretty generic…so that these images could appeal to anyone.” Whether aracial images or image descriptions actually do appeal to anyone is a much larger question, but what is relevant is that AP3 felt that because these images were to be user icons, she needed to keep the descriptions of people ambiguous.
The artists working on the Avatar Project likewise said the imagery's intended use as an inclusive tool was an important factor when deciding how to depict identity. A2 summarized the project as “Google asked me to create a series of avatars that represented diversity… the goal was simply to represent the widest possible spectrum of human beings” including an image of disability that emphasized “an extremely shareable and ordinary moment” where “the lack of limb [was] visible but not overwhelming.” A1 similarly felt that they “didn't want our depictions to tokenize or degrade any person or specific disability, or flatten them to just that… it was the goal for each and every avatar to show a multi-dimensional story of the character… not centering a disability as a single identity.” Despite the inclusive purpose, artists discussed wanting to keep a degree of intentional ambiguity. A1 explains that “we had specific ages/genders/races in mind, but the final product keeps a lot of them pretty ambiguous… Just because I assigned a specific gender to a character in my mind, doesn't mean that people will/should see it the same way.” A3 felt that “any type of successful art of any kind has these layers… where it doesn't hit you over the head, but it kind of suggests something and leaves a little bit of room [for interpretation].” Inclusive imagery, in the minds of artists, did not always require explicit representation of identities.
What was interesting was that, despite the role of artists in depicting identities, there was not a clearly defined path of intent from artist to artwork. A1, talking about themselves and the character designer they worked with, said: “I feel our art was adapting to the purpose/subject matter… This art was for a client, and it had a specific purpose to represent people inclusively.” A2 similarly argued that “the job of an illustrator (at least in my case) is to represent an idea, a concept, a state of mind, that is usually someone else's, the writer of the text.” The pre-figured intent coming from the project sponsor therefore informed both the alt text and the artwork (see Figure 2).
5.2.2 Balancing Pragmatic Considerations versus Personal Intuition and Expertise.
Another tension we saw at play when artists and accessibility practitioners were creating accessible imagery was the need to balance the pragmatic concerns of the organization and users with the individual agency and expertise of the contributing image and alt text creators.
There were a host of pragmatic considerations that accessibility practitioners weighed when deciding how to create alt text. AP2 valued human generated alt text, but explained he would never be the person to say, “‘Okay let's leave everything else and just work on the alt text,’ because we've got 10 thousand pages full of images and we're going to be spending the next year labeling those images.” AP1 emphasized that pragmatically when getting new people up to speed on writing alt text, she tells them to focus on what about the image is “meaningful” or “most salient” which “[is] tricky because [what] you're asking… someone to do is say ‘What is this image's role in the [webpage]? What is it there for?’” Thus, both the scale of a large company and the complexity of producing useful alt text when writers are inexperienced affect the methods and approaches used to create alt text.
There are also personal levels of expertise that inform how individuals create inclusive and accessible imagery within the context of these larger constraints. For example, A3 explained that when they decide to move from researching a topic to creating the images themselves there is a certain amount of intuition that they have as an artist. “It's just a balance…I have to read about and at least sort of understand what's going on…, but I think it's just sort of intuitively knowing when… I've done enough of that.” AP1 has a similar level of expertise in her role as a UX writer. She explains that “[she] take[s] responsibility, happily, for the way language and image work in our product” and that in that role of responsibility, she understands that “oftentimes the secret is that someone made the image without a clear intent or without a purpose and that's when alt text becomes really hard.” For both artists and writers, personal experience and expertise in their field give them insight into how accessible and inclusive imagery needs to be produced. This insight is balanced against tangible time constraints and pragmatic considerations of scale to produce the final processes and deliverable user experiences.
5.2.3 Understanding the Relative Importance of Words, Images, and Alt Text.
Another element that influenced the approaches image creators and alt text creators took to creating accessible imagery was how each of the different players in the process valued words, images, and alt text. AP1 explained that asking someone “to only do the alt text is like clearly missing a big part. Because like what is the surrounding? The adjacent text is a big part of that experience too.” Describing the way images, surrounding text, and alt text work together, AP1 argues that: “A visual example, even if it just reinforces the concept is probably still useful…[but] the adjacent text is probably already saying exactly what I would want [in] the alt text…[it is] this in-between thing where… this is not so essential to meaning, [but] it's not decorative.” The parallel relationship between the written words, imagery, and alt text is shown in Figure 3. However, as AP1 shows, the written text, the imagery, and the alt text are not always clearly defined in relation to each other in practice. AP2 even gives the example of describing one's appearance at a conference to add an additional layer to the puzzle: “Some people will say… ‘oh you should describe yourself to other people’… I feel like if people hear my voice, I want them to experience just my voice without knowing how I look… There's a beauty in not knowing too much about [a] person.” Thus, image description processes, in practice, are about balancing the overlapping information sources that are available to different users and considering how to best combine words, images, and image descriptions to achieve the desired result.
Perhaps unsurprisingly, in cases where images, words, and alt text are all working together, the perspective from which you look at the problem determines what is seen as the most important. For example, A1 explains “it took a lot of thought to figure out what disabilities would translate visually and the most clearly.” AP1 explained one area where she put considerable time and effort was perfecting the words and images as part of the overall experience of the product she was working on:
“I will say that something I'm proud of with this site, even with the compressed time… We tried to create identities… even though alt text treats the imagery as though it were arbitrary, It's not. We were very deliberate in choosing a few different people, different ages, different identity, and then also building out a little mini world for them too because once you see someone's phone screen it implies like a whole social circle…this is a more complex representation of identity than the previous generation of our products… there's an intentional little story there… But now I'm wondering… the alt text sphere has no idea we made all these choices, right?... Sometimes the same thing is true for... for all users, which is that I wish there were ways of scaling its complexity, [making versions which are] more sophisticated… because right now I can only treat it as this thing where I have to say as little as possible, as efficiently as possible.”
Alt text was ultimately thought about last and least because there was not an expert on alt text or screen readers’ experiences to weigh in. It never “figured into anyone's priorities” as AP1 said earlier. In the overall production process, images and words both have champions, the artists and the UX writers, but the alt text is lacking the same kind of organizational champion role.