Firefox Voice: An Open and Extensible Voice Assistant Built Upon the Web

Published: 07 May 2021

Abstract

Voice assistants are fundamentally changing the way we access information. However, voice assistants still leverage little about the web beyond simple search results. We introduce Firefox Voice, a novel voice assistant built on the open web ecosystem with an aim to expand access to information available via voice. Firefox Voice is a browser extension that enables users to use their voice to perform actions such as setting timers, navigating the web, and reading a webpage’s content aloud. Through an iterative development process and use by over 12,000 active users, we find that users see voice as a way to accomplish certain browsing tasks efficiently, but struggle with discovering functionality and frequently discontinue use. We conclude by describing how Firefox Voice enables the development of novel, open web-powered voice-driven experiences.
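
As a concrete illustration of the "built upon the web" idea, the sketch below shows how a browser extension might map a spoken command to a browser action. This is a hypothetical minimal example, not Firefox Voice's actual code: it assumes the Web Speech API is available for recognition (Firefox Voice's own transcription pipeline may differ) and that the extension holds the WebExtensions "tabs" permission.

// Hypothetical sketch: mapping a recognized utterance to browser actions in a
// WebExtension. Not Firefox Voice's implementation; assumes the Web Speech API
// is available and the extension has the "tabs" permission.
declare const browser: any; // WebExtensions API object, provided to extensions

const Recognition: any =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;
const recognizer = new Recognition();
recognizer.lang = "en-US";
recognizer.interimResults = false;

recognizer.onresult = (event: any) => {
  const utterance: string = event.results[0][0].transcript.toLowerCase();

  // Toy intent matching; a real assistant would use a grammar or intent parser.
  const nav = utterance.match(/^(?:go to|open)\s+(.+)$/);
  if (nav) {
    // e.g. "go to wikipedia" -> open a new tab at a guessed URL (illustrative only)
    browser.tabs.create({ url: `https://www.${nav[1].replace(/\s+/g, "")}.com` });
  } else if (utterance.startsWith("read this page")) {
    // Read the current page aloud via the speech synthesis half of the API.
    browser.tabs.executeScript({
      code: "speechSynthesis.speak(new SpeechSynthesisUtterance(document.body.innerText));",
    });
  }
};

recognizer.start();

The other commands described in the abstract, such as setting timers, would follow the same pattern: parse the utterance into an intent, then invoke the corresponding WebExtensions or page API.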




Published In

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
May 2021
10862 pages
ISBN: 9781450380966
DOI: 10.1145/3411764
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. CUI
  2. browser extension
  3. conversational user interface
  4. open source
  5. voice assistant

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CHI '21

Acceptance Rates

Overall Acceptance Rate: 6,199 of 26,314 submissions (24%)



