Firefox Voice: An Open and Extensible Voice Assistant Built Upon the Web

Published: 07 May 2021

Abstract

Voice assistants are fundamentally changing the way we access information. However, voice assistants still leverage little about the web beyond simple search results. We introduce Firefox Voice, a novel voice assistant built on the open web ecosystem with an aim to expand access to information available via voice. Firefox Voice is a browser extension that enables users to use their voice to perform actions such as setting timers, navigating the web, and reading a webpage’s content aloud. Through an iterative development process and use by over 12,000 active users, we find that users see voice as a way to accomplish certain browsing tasks efficiently, but struggle with discovering functionality and frequently discontinue use. We conclude by describing how Firefox Voice enables the development of novel, open web-powered voice-driven experiences.
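
As a concrete illustration of the "built upon the web" idea, the sketch below shows how a browser extension might map a spoken command to a browser action. This is a hypothetical minimal example, not Firefox Voice's actual code: it assumes the Web Speech API is available for recognition (Firefox Voice's own transcription pipeline may differ) and that the extension holds the WebExtensions "tabs" permission.

// Hypothetical sketch: mapping a recognized utterance to browser actions in a
// WebExtension. Not Firefox Voice's implementation; assumes the Web Speech API
// is available and the extension has the "tabs" permission.
declare const browser: any; // WebExtensions API object, provided to extensions

const Recognition: any =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;
const recognizer = new Recognition();
recognizer.lang = "en-US";
recognizer.interimResults = false;

recognizer.onresult = (event: any) => {
  const utterance: string = event.results[0][0].transcript.toLowerCase();

  // Toy intent matching; a real assistant would use a grammar or intent parser.
  const nav = utterance.match(/^(?:go to|open)\s+(.+)$/);
  if (nav) {
    // e.g. "go to wikipedia" -> open a new tab at a guessed URL (illustrative only)
    browser.tabs.create({ url: `https://www.${nav[1].replace(/\s+/g, "")}.com` });
  } else if (utterance.startsWith("read this page")) {
    // Read the current page aloud via the speech synthesis half of the API.
    browser.tabs.executeScript({
      code: "speechSynthesis.speak(new SpeechSynthesisUtterance(document.body.innerText));",
    });
  }
};

recognizer.start();

The other commands described in the abstract, such as setting timers, would follow the same pattern: parse the utterance into an intent, then invoke the corresponding WebExtensions or page API.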




Published In

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
May 2021
10862 pages
ISBN: 9781450380966
DOI: 10.1145/3411764
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. CUI
  2. browser extension
  3. conversational user interface
  4. open source
  5. voice assistant

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CHI '21

Acceptance Rates

Overall Acceptance Rate: 6,199 of 26,314 submissions (24%)



