1 Introduction
Do you have a podcast? You could be the creator of one of the 473,870 active shows on Apple Podcasts in August 2023 [
63], or have published one of the 85,047,441 episodes made available there since 2005 [
36]. With an exponentially growing library and listenership since the first podcasts in 2003 [
111], and 1 out of 10 UK adults planning to start a podcast in 2022 [
86], it is clear that podcasting has become a key feature of our media landscape. While the present for podcast creators around the world is a seemingly boundless space, filled with encouraging promises of things to come, the future of the medium remains blurry.
There are multiple new technologies currently in the process of “revolutionising” production methods and listener experience; to list only a few: new transcription solutions using
artificial intelligence (AI) to generate subtitles for episodes [
71,
106], semantic audio editing [
15], spatial audio capabilities in listening devices (like the Apple AirPods Pro) and programs [
16,
55], the development of tools allowing for new types of spatialised audio experiences, like Audio Orchestrator (a BBC Makerbox tool responsible for immersive podcasts like
Monster [
18] and
Spectrum Sounds [
20]), and the growing interest in object-based media and its potential applications to podcasts, through adaptive podcasting [
40] or non-linear programs [
102]. These projects are all akin to forms of personalisation, where personalisation serves the overall goal of immersion [
56]. However, relatively little is known about podcast creators’ perspectives on these technologies, how they are integrating them into their workflows, and what they consider the technologies’ impact on listeners to be. This is because the available data on podcast creators are often limited to demographic information and rarely cover their opinions on their field.
For more traditional media, like film and radio, the perspectives of industry professionals have been thoroughly documented in dedicated academic publications [
26,
27,
69,
110]. Comparatively little is known about the corresponding aspect of podcasting.
As listeners, our enthralment with on-demand audio content can be linked to several facets of the medium. Its versatility of genres and styles widens with every passing year. Shows explore novel formats and transcend expectations, reaching new audiences [
72]. The episodic nature of podcasts fosters the loyalty of listeners [
84], and boosts engagement, for instance through listening parties or the release of complementary content [
107]. Motives for listening to podcasts are varied, but the ideas of divertissement and social belonging appear in several studies looking at reasons behind podcast consumption [
89].
To media producers, podcasting’s appeal is threefold:
(1) Podcasts reach over 41% of people over 12 years old in the US every month [
41], a percentage that has grown yearly. This wide and increasing audience constitutes an incentive for both larger broadcasting companies (like the BBC, NPR, Megaphone, iHeartRadio) and independent creators to invest resources and time into the production of podcasts.
(2) These investments have a good chance of turning a profit. Indeed, the podcasting industry was worth $11.46 billion in 2020 [
49], thanks to advertisements, sponsored content, and, more recently, paid subscriptions [
8].
(3) The creative freedom the medium offers allows projects to find unique spaces in which to develop.
What will innovative, personalised, and immersive podcasts of the future be like? To answer this question, we need to better understand the current practices, behaviour, and perspectives of podcast creators. This will enable us to paint a picture of the primary user of these new podcasting tools, and thus give designers and researchers the opportunity to justify future design decisions on the basis of empirical data.
The question of the future of podcasting could also be addressed by asking other actors within the podcasting industry what they would want and need from podcasts in the future. Becker et al. [
21] recommend using an inclusive approach to designing audiovisual software that melds Human-Computer Interaction and Audience Studies into one Audiovisual Design method. This method is based on an analysis of the roles of different actors:
audience (for podcasting, the passive listeners),
synthesiser (the listeners who curate content for others),
modifier (the listeners who reproduce, repackage, or otherwise interact with the content by modifying it),
player (engaged listeners, keen to interact), and
producer (content creators) of a product. To this, one can add the non-negligible role of advertisers and platforms [
100]. Each of these actors will have different expectations from next-generation podcasting, and particularly, different expectations of a tool for producing and delivering these new podcasts [
21].
When considering existing work on the experience of these actors and their expectations, we note that literature that details the needs of the listeners is widely available [
7,
73]. Advertisers and platforms are disembodied actors, which makes understanding their expectations more complex, and they are not the target users of the tools that will be built. Therefore, in this article, we focus on the “podcast creator” as a target user. Our premise is that understanding their motivations and expectations could be the missing piece of a tricky puzzle in which not only the user experience but also the content and end products should be considered [
21].
1.1 Research Aims
Characterising a target user is a critical milestone when designing new tools [
60]. The aim of this study is to gather critically relevant information about how creators (a) produce podcasts, and (b) explore the affordances of personalisation in innovative forms of podcasting, with near- rather than long-term innovations in mind. Via interviews, we will piece together a representation of the podcaster as a target user, including their habits and views on next-generation podcasting, so that designers, researchers, and stakeholders interested in the development of new tools for podcasting can consider a legitimate user type and their requirements.
To cover these different aspects of a podcast creator’s perspectives, we will focus on the following research questions:
RQ1. What do podcast creators envision as “next-generation podcasting”?
RQ2. What tools do podcast creators use and why?
RQ3. How would new tools and habits be integrated into podcasters’ established production workflows?
5 Discussion
How podcast creators envision innovation and then explore and produce with innovative techniques are matters that will affect the experience of millions of listeners worldwide. This article describes interviews that were undertaken with 16 podcast creators, in order to shed light on creators’ perceptions of next-generation podcasts, indicate how innovation might be incorporated into production workflows, and formulate some requirements and expectations of tools built to create new forms of audio-centric programming. In this final section of the article, we summarise and interpret the results of our qualitative and quantitative analyses, draw some conclusions, and finish by identifying some limitations and ideas for future work, laying down the foundations necessary to make advances in the world of podcasting, particularly in terms of production tools and listener experience.
5.1 What do Podcast Creators Envision as “Next-generation Podcasting”?
Creators interviewed in this study expressed two separate goals that appear to contradict each other. The first is to improve listener experience, through a combination of new formats, higher quality audio, or finding ways for content to be more engaging for audiences; the second is to simplify and streamline their production process, by using faster, smarter, more efficient tools. However, practices that could simplify the creator’s work, like synthesising voices, sound effects, or un-mixing music to separate tracks, could have the adverse effect of worsening the listener experience overall. Conversely, adding features to podcasts in order to improve listener experience could greatly complicate an already-convoluted workflow.
By looking at RQ1, “What do podcast creators envision as next-generation podcasting?,” we bring to light this duality in expectations. Participants agree that next-generation podcasting should involve a form of improvement of the listener experience (a “listener-centric” vision of next-generation podcasting), but they also show an interest in using the technologies presented as tools to simplify production and reduce their workload (a “creator-centric” reaction to our demonstrations). Aligning these two approaches to podcast innovations could be paramount to improving both the listener’s and the creator’s experience of podcasting. For all interviewees but one cautious independent producer, this combined improvement is synonymous with “next-generation” podcasting. Regardless of its application (listener- or creator-centric), purpose-driven innovation prevailed in participants’ reasoning, with the aim of easily producing better quality, more entertaining, informative, engaging and immersive content at the centre of “next-generation podcasting.”
In turn, this allows us to narrow down the research field for “next-generation podcasting,” by looking at the technologies that would best suit this scenario. The nuance our qualitative analysis brings to our quantitative results helps us decipher our participants’ answers and bring into focus technologies that appear plausible candidates for “next-generation podcasting,” while discarding more problematic ideas. For instance, we can discard motion-based recognition, as both Touch/Tilt and Gesture Recognition raised concerns over accessibility and disability, and generally went against the idea of podcasts being a “hands-free” medium. Reverse-engineering music to separate stems and synthesising sound effects or voices could very well be an asset for creators, but participants interviewed expressed reservations concerning ethics, authenticity, and quality, which could potentially hinder the listener experience more than the creator’s process would benefit from the implementation of these tools.
It is also important to remark that the technologies presented in the video demonstrations evoked similar ideas in participants. Four creators were interested in creating podcasts that adapted to the listener’s environment; three were keen to explore location-based personalisation; two wanted to create podcasts that varied depending on the listener’s time of day. Participant K said:
I am really looking forward to the day when a mobile device can respond to the environment that the listener was in, and automatically change the dynamics of the content, or change the loudness of the content presented to the listeners, so if you’re on a subway and it’s very loud, it will decrease the dynamic range of the content and perhaps turn it up just a little bit for you to make it easier to listen to.
These notions of adaptivity and reactivity to attributes on the user side are consistent with the concept of “perceptive media” (media that perceives one’s actions and then adapts to them), as coined by Ian Forrester, and his goal to create podcasts that adapt to the listeners [
40].
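Participant K's scenario can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not a description of any existing player: it assumes the listening device can report an ambient noise level in dB SPL, and the function names, the 40 dB and 85 dB anchor points, the 4:1 maximum compression ratio, and the 6 dB maximum boost are all hypothetical values chosen for the example.

```python
def adapt_playback(ambient_db_spl: float,
                   quiet_db: float = 40.0,
                   loud_db: float = 85.0) -> dict:
    """Map an ambient noise level to playback settings (hypothetical mapping).

    In a quiet room the content is left untouched; as the environment gets
    louder, the compression ratio rises and a small make-up gain is added.
    """
    # Normalise the ambient level to the range 0..1 and clamp it.
    t = (ambient_db_spl - quiet_db) / (loud_db - quiet_db)
    t = min(1.0, max(0.0, t))
    return {
        "ratio": 1.0 + 3.0 * t,   # 1:1 when quiet, up to 4:1 when loud
        "gain_db": 6.0 * t,       # up to +6 dB of make-up gain
    }


def compress_level(level_db: float, threshold_db: float,
                   ratio: float, gain_db: float) -> float:
    """Static compressor curve applied to a single level in dB.

    Levels above the threshold are scaled down by the ratio, then the
    make-up gain is added to everything.
    """
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + gain_db


# Example: a loud subway (85 dB SPL, an assumed figure).
settings = adapt_playback(85.0)
loud_passage = compress_level(-10.0, -20.0, **settings)
quiet_passage = compress_level(-30.0, -20.0, **settings)
```

With these assumed settings, a quiet passage at -30 dB and a loud passage at -10 dB (20 dB apart) end up 12.5 dB apart and slightly louder overall, which is the "decrease the dynamic range and turn it up a little" behaviour the participant describes.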
Across the board, participants stressed that they do not want to overwhelm listeners with decisions. As Participant D explained: “Trying to get listeners to interact or do anything… It’s non-existent.” Any interactivity should therefore have a clear purpose and work in a non-intrusive fashion, hand in hand with immersion, as a means to achieve it rather than as a distraction from it.
5.2 What Tools do Podcast Creators Use and Why?
Podcast creators value easy-to-use, highly compatible, “no-code” software. Due to the lack of standardisation within podcast production practices, both independent and BBC-affiliated creators use a variety of tools to record, edit, and distribute their podcasts. But within this multitude, the demands of what is usually a collaborative process prevail, with creators favouring highly compatible, simple-to-use tools. Via RQ2, “What tools do podcast creators use and why?,” we attempt to shed light on the habits and expectations of practitioners pertaining to their software and equipment, which could give us insights into the requirements a podcasting tool should aim to fulfil.
According to answers to interview question (3) (What attributes make for good podcasting tools?), a podcasting tool should be efficient, compatible, useful, comfortable, and good value for the money (listed from most to least important to the group of participants). This should be read in the context of participants’ current practices. For instance, the six BBC creators agreed that their choice of software was influenced by the habits of people they worked with, yet they mentioned using four different DAWs (question (2): What tools are necessary for your work?). Although compatibility seems high on their list of priorities, personal preferences and background appear to play a bigger role in their choice of editing software, which speaks to the tension between seeking universality and lacking conformity.
This lack of conformity, but need for universality, means any new podcasting tool should aim to offer widespread support across different work tools. The need for simplicity and lack of coding expertise from the participants (question (5): Do you have any experience with coding? And would the need for coding deter you from using a tool?) informs us that any podcasting software should be very easy to use and not require any programming skills.
Understanding the desired functionalities and attributes of a new media tool is fundamental to its development [
112], and, by studying the requirements and expectations of podcast producers, we present a solid foundation on which innovative podcasting tools could be built.
5.3 How Would New Tools and Habits be Integrated Into Podcasters’ Established Production Workflows?
Integration of innovation will come in pre- or post-production phases. Podcast production is a complicated process, which, for the sake of producers, should be simplified rather than made more complex. Some apps such as Anchor.fm
6 take this approach of drastically simplifying the podcast production process, with all the steps required for basic podcast production (Figure
1) contained within one single web app. But, if the purpose of these new tools is to add features or improve substantially on existing ones, it can be expected that a minimal modification to the archetypal workflow presented would need to occur.
To answer RQ3 (“How would new tools and habits be integrated into podcasters’ established production workflows?”), we asked about the specifics of each participant’s workflow (interview question (4):
Do you have a particular workflow when creating a podcast?). Thanks to the archetypal production workflow detailed in Figure
1, we can begin to imagine where new software would fit best. We found that podcast production was a highly iterative process, and that therefore, we should respect the loops already in place (like writing
\(\leftrightarrow\) recording
\(\leftrightarrow\) editing), or take precautions to preserve them, but also not shy away from introducing another step that a creator could loop into their existing workflow. This analysis suggests a new step could be embedded as part of the pre-production phase, before or in tandem with booking, or in the post-production phase, after editing but before distribution.
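The placement argument above can be sketched as a small data-structure exercise. The Python below is a hedged illustration only: the stage list is a simplification assembled from the stages named in the text (it is not a reproduction of Figure 1), and `insert_before`, `loops_preserved`, and the stage name `new_step` are hypothetical names introduced for the example.

```python
# Simplified, assumed stage list; the archetypal workflow has more detail.
STAGES = ["booking", "writing", "recording", "editing", "distribution"]

# Iterative loops mentioned in the text: writing <-> recording <-> editing.
LOOPS = [("writing", "recording"), ("recording", "editing")]


def insert_before(stages, new_stage, before):
    """Return a new stage list with new_stage placed just before `before`."""
    i = stages.index(before)
    return stages[:i] + [new_stage] + stages[i:]


def loops_preserved(stages, loops):
    """A loop counts as preserved if its two stages are still adjacent."""
    return all(abs(stages.index(a) - stages.index(b)) == 1 for a, b in loops)


# Pre-production placement: before booking.
pre = insert_before(STAGES, "new_step", "booking")

# Post-production placement: after editing, before distribution.
post = insert_before(STAGES, "new_step", "distribution")

# A placement inside the iterative core breaks an existing loop.
inside = insert_before(STAGES, "new_step", "editing")
```

Under these assumptions, both suggested placements leave the writing, recording, and editing loops adjacent, whereas inserting the step between recording and editing does not.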
5.4 A Note on Accessibility
Prince [
87] acknowledges that podcasts are “unusually accessible,” referencing ease of use, low cost, and the flexibility that transcriptions offer to deaf or hard-of-hearing listeners. However, this last feature relies on the assumption that most podcasts would use transcripts, and that those would be of good quality. Seven participants discussed accessibility, just under half of the total number of creators interviewed. It seemed widely agreed upon that podcasts are not the most accessible in their current form, often lacking proper transcripts or simplified/audio-described interfaces. Transcripts are often unavailable, and they are a complicated feature for creators to include in their programmes. Although some tools already exist that facilitate this process (AI transcription tools, distribution platforms that specifically query for transcripts, etc.), these solutions often come at a cost for the creators. The multitude of distribution options and hosts, each with their own upload platforms and requirements, makes it harder for creators to expect and rely on the same accessibility features from one project to the other. This lack of consistency might in turn discourage some potential listeners. The participants who raised accessibility clearly reflected on these shortcomings in the medium as a whole.
5.5 The Podcast Creator as a Target User
These interviews have revealed key facts about podcast creators that can be used to better plan and design tools for next-generation podcasting, as covered by the prior subsections detailing our findings. Overall, we take away that podcast creators (1) are interested in delivering better, more immersive and engaging experiences to their listeners, (2) have an already-complex workflow composed of a wide range of tasks and skills, (3) are looking for ways to simplify this complex production process, (4) want their production tools to be efficient, compatible, useful, comfortable, a good value for the money, and no-code, (5) are looking for ways to adapt their podcasts to their listeners, (6) are concerned with accessibility and reaching as wide an audience as possible, and (7) are wary of unethical uses of AI in media.
5.6 Limitations
The exploratory nature of this study required choices to be made in preparation for the interviews. For instance, although we have justified the inclusion of the 12 demonstration videos presented, they do not represent an exhaustive list of technologies that could be used for next-generation podcasting, but rather a selection of technologies that could be implemented within a time frame appropriate to our wider research project. The demonstrations presented may therefore be perceived as a subjective collection of potential technologies, with their inclusion (and the exclusion of others) justified by the aim of our research.
Overall, we registered an average interest in the technologies demonstrated that was above the middle of the rating scale. This could be explained by participant self-selection—the recruitment process may have appealed to people who were particularly passionate about the application of new technology to audio and related media.
We chose to consider the participants’ views on technologies for content and interface personalisation independently, because the technologies in these two groups served fundamentally different goals, and were presented with a short break in between. Overall, the two categories were rated as equally interesting, with a median interest of 4 on a 1–5 Likert scale in both groups of technologies.
Potential bias that some creators may have had due to prior familiarity with certain technologies also needs addressing. This might have led them to have a more favourable impression of the technologies of which they were already aware, and in turn skewed the data towards these concepts, like non-linear narratives, where all participants were familiar with various existing incarnations. It is unclear whether the high interest registered for this concept was due to a general, mainstream knowledge of the technology compared to other demos, or to a real preference.
We did not systematically collect data regarding the size of the teams in which the creators worked. We can, however, comment on our impression that results were consistent regardless of team size or affiliation, and that these two factors did not seem correlated.
Our thematic analysis was led by one investigator but checked by two others. By its very nature, qualitative analysis is subjective, but the researchers have tried to minimise human error by following current best practices [
25,
31,
50], through acknowledging bias, and having discussions surrounding the codes and results [
34,
68].
This study focuses solely on exploring the creators’ current production habits and outlook on the future of podcasting, in order to bridge a gap noted in the literature regarding the role and expectations of the professional podcaster. Other actors’ points of view, like those of the listeners, advertisers, or platforms, could be explored to better contextualise the research presented in this study.
5.7 Future Work
The design guidelines (Section
5.5) uncovered by the interviews can be applied to research and development projects across the industry that focus on delivering more immersive and personalised programmes to the user. Object-based audio personalisation is one of the many facets of interactivity that could be further explored using the recommendations and insights revealed by this article [
113]. Tool designers who want to make it easier to produce innovative podcasts that cross over with other media—like game-ified podcasts [
2,
104], or podcasts that exploit the non-linear possibilities of long-form content—could benefit from the conclusions of this article.
The question of standardisation within the podcasting world also comes to mind. With all the emerging possibilities for the medium, are we moving further away from the fixed format we are used to—the one that relies on a single, immutable MP3 file? Forrester [
40] argues that an intermediary format that would support more adaptable podcasts is necessary, and proposes this could be achieved through authoring.
7 This would enable podcasts to be compiled on the user’s device, relying on a markup language to deliver personalised audio content. This kind of innovative format should be supported not only by adequate production tools but also by bespoke delivery systems.
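To illustrate what compiling an episode on the listener's device could look like, the sketch below uses a plain Python list as a stand-in for a markup document. It is a hypothetical structure invented for this example and is not the format proposed by Forrester: the `src` and `when` keys, the file names, and the `compile_episode` function are all assumptions. The `time_of_day` attribute echoes the time-based personalisation that participants mentioned.

```python
# Hypothetical manifest: an ordered list of audio segments, some of which
# only play when the listener's context matches the "when" conditions.
MANIFEST = [
    {"src": "intro.mp3"},
    {"src": "story_morning.mp3", "when": {"time_of_day": "morning"}},
    {"src": "story_evening.mp3", "when": {"time_of_day": "evening"}},
    {"src": "outro.mp3"},
]


def compile_episode(manifest, context):
    """Select the segments whose conditions all match the listener context.

    Segments without a "when" block are unconditional and always included.
    The result is an ordered playlist of source files for this listener.
    """
    playlist = []
    for segment in manifest:
        conditions = segment.get("when", {})
        if all(context.get(key) == value for key, value in conditions.items()):
            playlist.append(segment["src"])
    return playlist


morning_episode = compile_episode(MANIFEST, {"time_of_day": "morning"})
default_episode = compile_episode(MANIFEST, {})
```

In this sketch the same manifest yields a different ordered playlist per listener, and a device that reports no context still receives the unconditional segments, so the episode degrades to a fixed baseline rather than failing.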
Beyond the changes in the creators’ outputs that this study could support, the evolution of producers’ and writers’ creative agency and their perception of creative agency within the context of more interactive programmes could be explored. The development of next-generation podcasts and podcasting tools will bring forth new questions surrounding both creator and listener experience, helping us understand our relationship both to audio content and personalised multimedia.
5.8 Conclusion
This study delves into the intricacies of podcast production and the concept of next-generation podcasting. It explores the current practices of podcast producers, revealing their archetypal production workflows and habits, and postulating that these preferences could form the basis for podcasting innovation and research in the future. We investigate the perspectives of independent and mainstream creators on next-generation podcasting, bringing to light their expectations for tools enabling better listener experiences, but also tools that facilitate their work, and their view of how a selection of new technologies could be leveraged within their work. The amalgamation of these findings allows us to hypothesise how a new podcasting tool could be implemented within existing production habits. This work suggests that creators would be receptive to easy-to-use, highly compatible, “no-code” software, which integrates easily within complex pre- or post-production setups, and that they are particularly interested in technologies that could be applied to adapting their content to their audience, whether for editorial or accessibility reasons.