research-article

Evorus: A Crowd-powered Conversational Assistant Built to Automate Itself Over Time

Authors: Ting-Hao (Kenneth) Huang, Joseph Chee Chang, Jeffrey P. Bigham

CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems

Paper No.: 295, Pages 1 - 13

https://doi.org/10.1145/3173574.3173869

Published: 21 April 2018

Abstract

Crowd-powered conversational assistants have been shown to be more robust than automated systems, but at the cost of higher response latency and monetary expense. A promising direction is to combine the two approaches into high-quality, low-latency, and low-cost solutions. In this paper, we introduce Evorus, a crowd-powered conversational assistant built to automate itself over time by (i) allowing new chatbots to be easily integrated to automate more scenarios, (ii) reusing prior crowd answers, and (iii) learning to automatically approve response candidates. Our 5-month-long deployment with 80 participants and 281 conversations shows that Evorus can automate itself without compromising conversation quality. Crowd-AI architectures have long been proposed as a way to reduce cost and latency for crowd-powered systems; Evorus demonstrates how automation can be introduced successfully in a deployed system. Its architecture allows future researchers to innovate further on the underlying automated components in the context of a deployed open-domain dialog system.
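The three mechanisms listed in the abstract can be pictured with a small sketch. This is an illustration only, not the Evorus implementation: the names `Candidate`, `gather_candidates`, `split_by_approval`, `weather_bot`, `reuse_bot`, the fixed confidence values, and the 0.8 threshold are all hypothetical. In particular, the abstract says approval is learned, whereas this sketch substitutes hard-coded scores and a fixed cutoff.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Candidate:
    text: str
    source: str        # which component proposed the response (illustrative label)
    confidence: float  # hypothetical score in [0, 1]; hard-coded here, not learned


# A "bot" is any callable that may propose a response or decline with None.
Bot = Callable[[str], Optional[Candidate]]


def gather_candidates(message: str, bots: Dict[str, Bot]) -> List[Candidate]:
    """Ask every registered bot for a candidate response."""
    candidates = []
    for bot in bots.values():
        candidate = bot(message)
        if candidate is not None:
            candidates.append(candidate)
    return candidates


def split_by_approval(
    candidates: List[Candidate], threshold: float = 0.8
) -> Tuple[List[Candidate], List[Candidate]]:
    """Auto-approve confident candidates; leave the rest for crowd review."""
    approved = [c for c in candidates if c.confidence >= threshold]
    pending = [c for c in candidates if c.confidence < threshold]
    return approved, pending


def weather_bot(message: str) -> Optional[Candidate]:
    # Stands in for (i): an integrated chatbot covering one scenario.
    if "weather" in message.lower():
        return Candidate("It looks sunny today.", "weather_bot", 0.9)
    return None


# Stands in for (ii): a store of answers previously given by crowd workers.
PRIOR_ANSWERS = {"hello": "Hi! How can I help you?"}


def reuse_bot(message: str) -> Optional[Candidate]:
    answer = PRIOR_ANSWERS.get(message.strip().lower())
    if answer is None:
        return None
    return Candidate(answer, "reuse_bot", 0.6)


# Registering a new bot is a single dictionary entry.
bots: Dict[str, Bot] = {"weather": weather_bot, "reuse": reuse_bot}
approved, pending = split_by_approval(gather_candidates("hello", bots))
```

In this sketch a low-confidence candidate stays in `pending`, which is where human workers would remain in the loop; raising or lowering the threshold trades automation against oversight.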

Supplementary Material

Supplemental video: suppl.mov (pn2817-file5.mp4), 2.41 MB

Index Terms

Human-centered computing

Published In

CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems

April 2018

8489 pages

ISBN:9781450356206

DOI:10.1145/3173574

General Chairs: Regan Mandryk (University of Saskatchewan, Canada), Mark Hancock (University of Waterloo, Canada)

Program Chairs: Mark Perry (Brunel University London, UK), Anna Cox (University College London, UK)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Honorable Mention

Conference

CHI '18

Sponsor:

SIGCHI

CHI '18: CHI Conference on Human Factors in Computing Systems

April 21 - 26, 2018

Montreal QC, Canada

Acceptance Rates

CHI '18 Paper Acceptance Rate: 666 of 2,590 submissions (26%)

Overall Acceptance Rate: 6,199 of 26,314 submissions (24%)

Bibliometrics

Total Citations: 48

Total Downloads: 852

Downloads (last 12 months): 99

Downloads (last 6 weeks): 10

Reflects downloads up to 14 Oct 2024

Cited By

Cannanure V, Ngoon T, Wolf S, Jasińska K, Brown T, Ogan A. (2024). Understanding the Longitudinal Impact of a Chatbot to Facilitate a Virtual Community of Practice for Teachers in Rural Côte d’Ivoire. ACM Journal on Computing and Sustainable Societies 2:3 (1-37). Online publication date: 16-Sep-2024. https://dl.acm.org/doi/10.1145/3675762

Huang S, Lin Y, He Z, Huang C, Huang T. (2024). How Does Conversation Length Impact User’s Satisfaction? A Case Study of Length-Controlled Conversations with LLM-Powered Chatbots. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (1-13). Online publication date: 11-May-2024. https://dl.acm.org/doi/10.1145/3613905.3650823

Kisso G, Fotrousi F. (2024). Requirements Conflicts Detection: Advancing with Conversational AI. 2024 IEEE 32nd International Requirements Engineering Conference Workshops (REW) (101-107). Online publication date: 24-Jun-2024. https://doi.org/10.1109/REW61692.2024.00019

Caffaro F, Rizzo G. (2024). Knowledge-Enhanced Conversational Agents. Journal of Computer Science and Technology 39:3 (585-609). Online publication date: 22-Jul-2024. https://doi.org/10.1007/s11390-024-2883-4

Correia A, Grover A, Schneider D, Pimentel A, Chaves R, de Almeida M, Fonseca B. (2023). Designing for Hybrid Intelligence: A Taxonomy and Survey of Crowd-Machine Interaction. Applied Sciences 13:4 (2198). Online publication date: 8-Feb-2023. https://doi.org/10.3390/app13042198

Zhao Y, Zhu Z, Chen B, Qiu S. (2023). Leveraging Human-AI Collaboration in Crowd-Powered Source Search: A Preliminary Study. Journal of Social Computing 4:2 (95-111). Online publication date: Jun-2023. https://doi.org/10.23919/JSC.2023.0002

Ma Y, Abbas T, Gadiraju U. (2023). ContextBot. Proceedings of the 34th ACM Conference on Hypertext and Social Media (1-14). Online publication date: 4-Sep-2023. https://dl.acm.org/doi/10.1145/3603163.3609031

Xiao Z, Liao Q, Zhou M, Grandison T, Li Y. (2023). Powering an AI Chatbot with Expert Sourcing to Support Credible Health Information Access. Proceedings of the 28th International Conference on Intelligent User Interfaces (2-18). Online publication date: 27-Mar-2023. https://dl.acm.org/doi/10.1145/3581641.3584031

Dey S, Duff B, Karahalios K. (2023). It Is All About Criticism: Understanding the Effect of Social Media Discourse on Legal Crowdfunding Campaigns. Proceedings of the ACM on Human-Computer Interaction 7:CSCW1 (1-37). Online publication date: 16-Apr-2023. https://dl.acm.org/doi/10.1145/3579500

Lei K, Yu M, Lewellen M, Ku V, Lee D. (2023). Compass: Supporting Large Group Mentorship in a Chat-Based UI. Proceedings of the ACM on Human-Computer Interaction 7:CSCW1 (1-25). Online publication date: 16-Apr-2023. https://dl.acm.org/doi/10.1145/3579470
